em network issues

Bruce Evans bde at zeta.org.au
Thu Oct 19 07:07:09 UTC 2006


On Thu, 19 Oct 2006, Scott Long wrote:

> Bruce Evans wrote:

>>>> On Wed, 18 Oct 2006, Kris Kennaway wrote:
>>>>> I have been working with someone's system that has em shared with fxp,
>>>>> and a simple fetch over the em (e.g. of a 10 GB file of zeroes) is
>>>>> enough to produce watchdog timeouts after a few seconds.
>>>> 
>>>> em_intr_fast() has no locking whatsoever.  I would be very surprised
>>>> if it even seemed to work for SMP.  For UP, masking of CPU interrupts
>>>> (as is automatic in fast interrupt handlers) might provide sufficient
>>>> locking, ...
>> 
>> I barely noticed the point about it being shared.  With sharing, and
>> probably especially with fast and normal interrupt handlers sharing an
>> IRQ, locking is more needed.  There are many possibilities for races.
>> One likely one is:
>> - em interrupt task running.  Device interrupts are disabled, so the
>>   task thinks it won't be interfered with by the em interrupt handler.
>
> What interference are you talking about?  em_intr_fast changes no state
> in the driver softc (aside from the silly bookkeeping).   It only reads
> from one register, and writes to no registers or shared memory.

It disables interrupts.  To do that, it calls em_disable_intr().  The
hardware is simple enough for em_disable_intr() not to have to make
many state changes, but it certainly has to make at least 1 to work.
It uses several layers of macros, which I think end up doing a single
write to one register in bus space.
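
To make the point concrete, here is a minimal sketch of a mask register
that is enabled and disabled through single writes.  It is a model under
assumptions, not the driver's code: struct fake_nic, FAKE_CAUSES and all
of the fake_*() functions are hypothetical names, and a plain struct
field stands in for the bus space register.

```c
#include <stdint.h>

/* Hypothetical set of interrupt causes the driver cares about. */
#define FAKE_CAUSES	0x0000000fu

/* Hypothetical model of the device: one bit per interrupt cause. */
struct fake_nic {
	uint32_t mask;		/* a set bit means that cause may interrupt */
};

/* Writing 1 bits to the "mask set" register enables those causes. */
void
fake_write_mask_set(struct fake_nic *nic, uint32_t bits)
{
	nic->mask |= bits;
}

/* Writing 1 bits to the "mask clear" register disables those causes. */
void
fake_write_mask_clear(struct fake_nic *nic, uint32_t bits)
{
	nic->mask &= ~bits;
}

/* Stand-in for an enable routine: one register write. */
void
fake_enable_intr(struct fake_nic *nic)
{
	fake_write_mask_set(nic, FAKE_CAUSES);
}

/* Stand-in for a disable routine: one register write, still a state change. */
void
fake_disable_intr(struct fake_nic *nic)
{
	fake_write_mask_clear(nic, 0xffffffffu);
}
```

Even in this stripped-down model the disable routine changes state that
the enable routine also changes, which is all the argument needs.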

>> - shared fxp interrupt.  The em interrupt handler is called.  Without
>>   any explicit synchronization, bad things may happen and apparently do.
>>   In the UP case, there is some implicit synchronization which may help
>>   but is hard to understand.
>
> Can you be more specific as to the 'bad things'?

Not very.  Maybe interrupts don't get reenabled as intended.  Then the
symptoms get mutated by watchdog timeouts.
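
As a purely illustrative sketch of why ordering matters here, the model
below replays two unsynchronized writers to the same enable state and
reports which one wins.  Everything in it is hypothetical (enum sim_step,
sim_replay()).  It deliberately leaves out the rest of what a driver
does, including any task the handler would queue, so it does not show
that the driver actually gets stuck, only that the end state depends on
the order.

```c
#include <stdbool.h>
#include <stddef.h>

/* The two writers in this model (hypothetical names). */
enum sim_step {
	SIM_TASK_ENABLE,	/* interrupt task re-enables device interrupts */
	SIM_HANDLER_DISABLE	/* fast handler disables device interrupts */
};

/*
 * Replay the steps in the given order and return whether device
 * interrupts end up enabled.  With no synchronization, the last
 * write simply wins.
 */
bool
sim_replay(const enum sim_step *steps, size_t n)
{
	bool enabled = false;	/* the task starts with interrupts disabled */
	size_t i;

	for (i = 0; i < n; i++)
		enabled = (steps[i] == SIM_TASK_ENABLE);
	return (enabled);
}
```

Replaying { SIM_HANDLER_DISABLE, SIM_TASK_ENABLE } ends with interrupts
enabled; replaying the same two steps in the opposite order ends with
them disabled.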

Bruce


More information about the freebsd-net mailing list