cvs commit: src/sys/dev/em if_em.c

Pyun YongHyeon pyunyh at gmail.com
Tue Aug 22 09:45:16 UTC 2006


On Tue, Aug 22, 2006 at 12:22:37PM +0400, Ruslan Ermilov wrote:
 > Hi Pyun,
 > 
 > On Tue, Aug 22, 2006 at 02:32:48AM +0000, Pyun YongHyeon wrote:
 > > yongari     2006-08-22 02:32:48 UTC
 > > 
 > >   FreeBSD src repository
 > > 
 > >   Modified files:
 > >     sys/dev/em           if_em.c 
 > >   Log:
 > >   It seems that em(4) misses Tx completion interrupts under certain
 > >   conditions. The cause of missing Tx completion interrupts comes from
 > >   Tx interrupt moderation mechanism(delayed interrupts) or chipset bug.
 > >   If Tx interrupt moderation mechanism is the cause of false watchdog
 > >   timeout error we should have to fix all device drivers that have Tx
 > >   interrupt moderation capability. We may need more investigation
 > >   for this issue. Anyway, the fix is the same for both cases.
 > >   
 > >   This should fix occasional watchdog timeout errors seen on a few
 > >   systems.
 > >   
 > >   Reported by:    -net, Patrick M. Hausen < hausen AT punkt DOT de >
 > >   Tested by:      Patrick M. Hausen < hausen AT punkt DOT de >
 > >   
 > >   Revision  Changes    Path
 > >   1.133     +12 -0     src/sys/dev/em/if_em.c
 > > 
 > I agree this is a less painful way to recover, but it's still a
 > watchdog and it slows down the performance when it happens.  After
 > this commit, if there's a moderate number of missing Tx completion
 > interrupts (for some reason), even a diagnostic message won't be
 > printed.  This is bad -- users will "seem" to have working but
 > slow systems, without any indication of what causes this slowness.

It just re-invokes the txeof handler and checks whether there are
pending Tx descriptors in the driver queue. If there are no pending Tx
descriptors, it's a false watchdog timeout and the handler just
returns without resetting the hardware. So there is no performance
drop. Of course, if we are out of Tx descriptors and missed Tx
completion interrupts, it would slow down the Tx process.
ATM I don't know what caused these missing Tx completion interrupts
(chipset bug, Tx interrupt moderation, or some other bug).
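
To make the check described above concrete, here is a minimal sketch
in C. It is a simplified model, not the code from if_em.c: the struct,
its field names, and the model_* functions are hypothetical stand-ins,
and the "ring is completely free" test is an assumption about how
"no pending Tx descriptors" would be detected.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a Tx ring; not the real em(4) softc. */
struct model_tx_ring {
    size_t num_desc;          /* total descriptors in the ring */
    size_t avail;             /* descriptors free for new packets */
    size_t hw_done;           /* finished by hardware, not yet reclaimed */
    unsigned watchdog_events; /* counted as output errors in this model */
    unsigned resets;          /* number of hardware re-initializations */
};

/* Stand-in for the txeof handler: reclaim descriptors the hardware finished. */
static void model_txeof(struct model_tx_ring *r)
{
    r->avail += r->hw_done;
    r->hw_done = 0;
}

/*
 * Stand-in for the watchdog handler. Returns true when the timeout was
 * false, i.e. nothing is pending once completed descriptors are reclaimed.
 */
static bool model_watchdog(struct model_tx_ring *r)
{
    /* Reclaim first, in case a Tx completion interrupt was missed. */
    model_txeof(r);

    if (r->avail == r->num_desc)
        return true;          /* false timeout: no error count, no reset */

    /* Real timeout: count the error and re-initialize the hardware. */
    r->watchdog_events++;
    r->resets++;
    r->avail = r->num_desc;   /* model: a reset frees the whole ring */
    r->hw_done = 0;
    return false;
}
```

With 8 descriptors, 5 free and 3 completed but unreclaimed, the model
returns without a reset; with only 1 completed, it falls through to the
reset path.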

 > I think a diagnostic message should still be printed in this case,
I have no objection to printing a diagnostic message. But if missing
Tx completion interrupts are a normal consequence on this hardware,
it would give a negative impression to users.

 > and adapter->watchdog_events should still be incremented, we just
 > don't need to reinit the chip in this case.
 > 
adapter->watchdog_events is used to count output errors (if_oerrors).
If we know the watchdog timeout is false, we should not increment the
counter, as we successfully transmitted without errors.

Because it's hard to reproduce, I guess it only happens under
certain conditions. In addition, we don't know how many Tx completion
interrupts are lost. If you think it should recover quickly from the
above condition without waiting for a watchdog timeout, we could
embed an em_txeof() call into em_local_timer() to sweep up Tx
descriptors that were successfully transmitted.
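
A rough sketch of that alternative, again as a simplified model in C
rather than real driver code. The struct and the sketch_* functions are
hypothetical, and disarming the watchdog counter once the ring is fully
reclaimed is an assumption made for illustration.

```c
#include <stddef.h>

/* Hypothetical Tx state; field names are illustrative only. */
struct sketch_tx_state {
    size_t num_desc;    /* total descriptors in the ring */
    size_t avail;       /* descriptors free for new packets */
    size_t hw_done;     /* finished by hardware, not yet reclaimed */
    int watchdog_timer; /* ticks left before the watchdog fires; 0 = off */
};

/* Stand-in for em_txeof(): reclaim and, if nothing is pending, disarm. */
static void sketch_txeof(struct sketch_tx_state *s)
{
    s->avail += s->hw_done;
    s->hw_done = 0;
    if (s->avail == s->num_desc)
        s->watchdog_timer = 0;
}

/* Stand-in for em_local_timer(): sweep completed descriptors every tick. */
static void sketch_local_timer(struct sketch_tx_state *s)
{
    sketch_txeof(s);
}
```

In this model a lost completion interrupt is cleaned up at the next
timer tick, so the watchdog only stays armed while descriptors are
genuinely still pending.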

-- 
Regards,
Pyun YongHyeon

