em network issues

Jack Vogel jfvogel at gmail.com
Sat Oct 21 02:40:52 UTC 2006


On 10/20/06, Bill Paul <wpaul at freebsd.org> wrote:
>
> [...]
>
> > > Another thing that might be handy is improving the watchdog timeout
> > > message so that it dumps the state of the ICR and ICM registers (and
> > > maybe some other interesting driver and/or device state). The timeout
> > > implies no interrupts were delivered for a Long Time (tm). If the
> > > ICM register indicates interrupts have been masked, then that means
> > > em_intr_fast() was triggered by an interrupt and it scheduled work,
> > > but that work never executed. If that really is what happened, then
> > > I can understand the watchdog error occurring. If that's _not_ what
> > > happened, then something else is screwed up.
> >
> > Jesse Brandeburg just did an interesting hack for the Linux driver, and I
> > was considering trying to code an equivalent thing up for us. We
> > have evidence that on some AMD based systems there are writebacks
> > that get lost; since the TX cleanup relies on the DD bit being set, you
> > are hosed when this happens. What he did was make a cleanup
> > routine that ONLY uses the head and tail pointers and NOT the done
> > bit. Then, in the watchdog routine, if there is evidence of this problem
> > it will switch the cleanup function pointer to this alternate clean code.
>
> Oho, I didn't realize the 8254x had producer/consumer indexes like this.
> Hm. But the documentation for the Transmit Descriptor Head register
> says:
>
>  "Reading the transmit descriptor head to determine which buffers
>   have been used (and can be returned to the memory pool) is not reliable."
>
> There's a similar notation for the Receive Descriptor Head register.
>
> I wonder what's unreliable about it.
>
> > At least one user that was having a problem has reported this solved
> > it. It may be one of the issues hitting us as well.
>
> Switching from testing the descriptor completion bits to using the
> consumer indexes should be pretty straightforward. It's worth a shot
> at any rate.
>
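
For reference, the watchdog diagnostic suggested in the quoted text could
be sketched roughly as below. This is only an illustration under stated
assumptions, not code from the em driver: the struct, its field names, and
both helper functions are hypothetical, and it assumes that a mask value of
zero means every interrupt source is currently masked.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot of device state taken when the watchdog fires. */
struct em_wd_snapshot {
    uint32_t icr;        /* interrupt cause bits seen at timeout */
    uint32_t intr_mask;  /* enabled interrupt sources; 0 = all masked (assumed) */
};

enum wd_verdict {
    WD_TASK_STALLED,     /* interrupts masked: deferred work never ran */
    WD_OTHER             /* interrupts still enabled: look elsewhere */
};

/* Apply the reasoning from the thread: if interrupts are masked at timeout,
 * the fast handler ran and scheduled work that has not executed. */
static enum wd_verdict wd_classify(const struct em_wd_snapshot *s)
{
    return (s->intr_mask == 0) ? WD_TASK_STALLED : WD_OTHER;
}

/* Build a more informative timeout message; returns snprintf's result. */
static int wd_format(char *buf, size_t len, const struct em_wd_snapshot *s)
{
    return snprintf(buf, len,
        "watchdog timeout: icr=0x%08x mask=0x%08x (%s)",
        (unsigned)s->icr, (unsigned)s->intr_mask,
        wd_classify(s) == WD_TASK_STALLED ?
            "deferred task stalled" : "cause unknown");
}
```

A real driver would read these values from the hardware at timeout rather
than from a prefilled struct.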

I have not yet looked at Jesse's code to see if he does anything fancy,
but there is one other driver that I know of on our hardware (and no, it's
not for that so-called OS from Redmond) that has always done this,
so it must not be THAT unreliable. It just isn't using the full capability
of the hardware, but if it works.... :)
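
To make the idea concrete, here is a minimal user-space sketch of the
approach described above: a default cleanup that trusts the DD bit, an
alternate cleanup that uses only the head index, and a watchdog check that
swaps the function pointer when a writeback looks lost. It is not Jesse's
code or the em driver's; every name here (tx_ring, hw_head, clean_by_dd,
clean_by_head, watchdog_check) is made up for illustration, and hw_head
simply stands in for a value read from the Transmit Descriptor Head
register.

```c
#include <stdint.h>

#define RING_SIZE 8u      /* hypothetical ring size */
#define STATUS_DD 0x01u   /* "descriptor done" status bit */

struct tx_desc { uint8_t status; };

struct tx_ring {
    struct tx_desc desc[RING_SIZE];
    unsigned hw_head;        /* stand-in for the head register value */
    unsigned tail;           /* next slot software will fill */
    unsigned next_to_clean;  /* oldest descriptor not yet reclaimed */
    unsigned (*clean)(struct tx_ring *);  /* active cleanup routine */
};

/* Default path: reclaim only descriptors whose DD bit was written back. */
static unsigned clean_by_dd(struct tx_ring *r)
{
    unsigned n = 0;
    while (r->next_to_clean != r->tail &&
           (r->desc[r->next_to_clean].status & STATUS_DD)) {
        r->desc[r->next_to_clean].status = 0;
        r->next_to_clean = (r->next_to_clean + 1) % RING_SIZE;
        n++;
    }
    return n;
}

/* Alternate path: reclaim everything the head index has already passed,
 * ignoring the DD bit entirely. */
static unsigned clean_by_head(struct tx_ring *r)
{
    unsigned n = 0;
    while (r->next_to_clean != r->hw_head) {
        r->desc[r->next_to_clean].status = 0;
        r->next_to_clean = (r->next_to_clean + 1) % RING_SIZE;
        n++;
    }
    return n;
}

/* Watchdog check: if the head has moved past the oldest descriptor but its
 * DD bit is still clear, assume a lost writeback and switch routines.
 * Returns 1 if it switched, 0 otherwise. */
static int watchdog_check(struct tx_ring *r)
{
    if (r->next_to_clean != r->hw_head &&
        !(r->desc[r->next_to_clean].status & STATUS_DD)) {
        r->clean = clean_by_head;
        return 1;
    }
    return 0;
}
```

Whether the head index can be trusted this way on real hardware is exactly
the open question from the documentation note quoted above, so treat this
as a fallback sketch rather than a recommended default.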

Jesse's code is supposed to be on our driver site on SourceForge. I just
have been too busy to go look for it, but it's public.

BTW, I got a Smartbits unit in my cubicle today, got software installed and
hardware almost there, not quite done yet. It sure can pump LOTS of packets
though :) Will report results as I get them.

Jack


More information about the freebsd-stable mailing list