em network issues

Scott Long scottl at samsco.org
Thu Oct 19 06:30:54 UTC 2006


Bruce Evans wrote:
> On Wed, 18 Oct 2006, Scott Long wrote:
> 
> [too much quoted; much deleted]
> 
>> Bruce Evans wrote:
>>> On Wed, 18 Oct 2006, Kris Kennaway wrote:
>>>
>>>> I have been working with someone's system that has em shared with fxp,
>>>> and a simple fetch over the em (e.g. of a 10 GB file of zeroes) is
>>>> enough to produce watchdog timeouts after a few seconds.
>>>>
>>>> As previously mentioned, changing the INTR_FAST to INTR_MPSAFE in the
>>>> driver avoids this problem.  However, others are seeing sporadic
>>>> watchdog timeouts at higher system load on non-shared em systems too.
>>>
>>> em_intr_fast() has no locking whatsoever.  I would be very surprised
>>> if it even seemed to work for SMP.  For UP, masking of CPU interrupts
>>> (as is automatic in fast interrupt handlers) might provide sufficient
>>> locking, ...
> 
> I barely noticed the point about it being shared.  With sharing, and
> probably especially with fast and normal interrupt handlers sharing an
> IRQ, locking is more needed.  There are many possibilities for races.
> One likely one is:
> - em interrupt task running.  Device interrupts are disabled, so the
>   task thinks it won't be interfered with by the em interrupt handler.

What interference are you talking about?  em_intr_fast changes no state
in the driver softc (aside from the silly bookkeeping).  It only reads
from one register, and writes to no registers or shared memory.
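
For illustration only, here is a minimal user-space sketch of the handler
shape described above: one register read, a bookkeeping counter, and
nothing else touched.  It is not the actual em(4) code.  The names
fake_dev, fake_intr_fast and the result enum are hypothetical, and
treating an all-ones read as "device gone" and a zero read as "not our
interrupt" are assumptions of the sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in for a device; not the real em(4) softc. */
struct fake_dev {
	uint32_t icr;		/* simulated interrupt cause register */
	unsigned long nintr;	/* bookkeeping counter only */
};

enum intr_result {
	INTR_NOT_OURS,		/* no cause bits set; another device on the IRQ */
	INTR_GONE,		/* all ones read back; assumed device removal */
	INTR_DEFER		/* caller should schedule the deferred task */
};

/*
 * Models a handler that performs a single register read and changes
 * no device state apart from a counter.
 */
static enum intr_result
fake_intr_fast(struct fake_dev *dev)
{
	uint32_t cause = dev->icr;	/* the only register access */

	if (cause == 0xffffffffu)
		return (INTR_GONE);
	if (cause == 0)
		return (INTR_NOT_OURS);

	dev->nintr++;			/* bookkeeping */
	return (INTR_DEFER);
}
```

A caller on a shared IRQ would invoke this for every interrupt on the
line and only schedule the deferred work when it returns INTR_DEFER.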

> - shared fxp interrupt.  The em interrupt handler is called.  Without
>   any explicit synchronization, bad things may happen and apparently do.
>   In the UP case, there is some implicit synchronization which may help
>   but is hard to understand.

Can you be more specific as to the 'bad things'?

Scott

