cvs commit: src/sys/dev/em if_em.c

Ruslan Ermilov ru at freebsd.org
Tue Aug 22 10:13:23 UTC 2006


On Tue, Aug 22, 2006 at 06:44:52PM +0900, Pyun YongHyeon wrote:
> On Tue, Aug 22, 2006 at 12:22:37PM +0400, Ruslan Ermilov wrote:
>  > I agree this is a less painful way to recover, but it's still a
>  > watchdog and it slows down the performance when it happens.  After
>  > this commit, if there's a moderate number of missing Tx completion
>  > interrupts (for some reason), even a diagnostic message won't be
>  > printed.  This is bad -- users will "seem" to have working but
>  > slow systems, without any indication of what causes this slowness.
> 
> It just reinvokes the txeof handler and checks whether there are pending
> Tx descriptors in the driver queue. If there are no pending Tx
> descriptors, it's a false watchdog timeout and it just returns without
> resetting the hardware.
> 
This is all clear.

> So there is no performance drop.  Of course, if we are out of
> Tx descriptors and missed Tx completion interrupts it would slow down
> Tx process.
> 
Yes, that's what I was talking about.

> ATM I don't know what caused this missing Tx completion interrupt.
> (chipset bug/Tx interrupt moderation or other bug)
> 
>  > I think a diagnostic message should still be printed in this case,
> I have no objection to printing a diagnostic message. But if missing
> Tx completion interrupts are a normal consequence of this hardware,
> it would give a negative impression to users.
> 
It would tell the truth, like

em0: watchdog timeout (missed Tx interrupt) -- recovering

(Maybe under bootverbose only.)

>  > and adapter->watchdog_events should still be incremented, we just
>  > don't need to reinit the chip in this case.
>  > 
> adapter->watchdog_events is used to count output errors (if_oerrors).
> If we know the watchdog timeout is false we should not increment the
> counter, as we successfully transmitted it without errors.
> 
It's still a watchdog event.  We can make it a separate counter,
like watchdog_tx_event, and not add it to oerrors, but still show
it in em_print_hw_stats().  It'd be useful to have these statistics
available.
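
A minimal sketch of the proposal above, written as a standalone C
simulation rather than as a patch to if_em.c.  Everything here is an
assumption made for illustration: struct sim_adapter, sim_txeof(),
sim_watchdog(), the reinit_count field and the verbose flag (standing in
for bootverbose) are hypothetical, and watchdog_tx_events is only the
counter name suggested in this message, not an existing field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the driver's per-device state. */
struct sim_adapter {
	const char	*name;
	int		tx_pending;	/* descriptors queued, not yet reclaimed */
	int		tx_completed;	/* done in hardware, interrupt missed */
	unsigned long	watchdog_events;	/* real timeouts (output errors) */
	unsigned long	watchdog_tx_events;	/* recovered false timeouts */
	unsigned long	reinit_count;	/* number of simulated chip resets */
	bool		verbose;	/* stand-in for bootverbose */
};

/* Reclaim every descriptor the hardware has already finished with. */
static void
sim_txeof(struct sim_adapter *sc)
{
	sc->tx_pending -= sc->tx_completed;
	sc->tx_completed = 0;
}

/* Watchdog: try a cheap recovery first, reset only if Tx is really stuck. */
static void
sim_watchdog(struct sim_adapter *sc)
{
	assert(sc != NULL);

	sim_txeof(sc);
	if (sc->tx_pending == 0) {
		/* False timeout: count it separately and skip the reinit. */
		sc->watchdog_tx_events++;
		if (sc->verbose)
			printf("%s: watchdog timeout (missed Tx interrupt)"
			    " -- recovering\n", sc->name);
		return;
	}

	/* Descriptors are still outstanding: treat as a real timeout. */
	printf("%s: watchdog timeout -- resetting\n", sc->name);
	sc->watchdog_events++;
	sc->reinit_count++;
	sc->tx_pending = 0;	/* a reset drops whatever was queued */
}
```

Only watchdog_events would feed if_oerrors in this scheme; the separate
counter is there purely so the recovered events remain visible in the
statistics output.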

> Because it's hard to reproduce, I guess it only happens under
> certain conditions. In addition, we don't know how many Tx completion
> interrupts are lost. If you think it should recover fast from the
> above condition without waiting for a watchdog timeout, we could
> embed an em_txeof() call into em_local_timer() to sweep up Tx
> descriptors that were successfully transmitted.
> 
That would make it look more like polling.  :-)

I'm pretty sure this problem is not unique to em(4).  Adding
these quirks to all drivers known to be subject to this issue
and gathering the statistics would be a good thing IMO.


Cheers,
-- 
Ruslan Ermilov
ru at FreeBSD.org
FreeBSD committer