call for bge(4) testers
Wilkinson, Alex
alex.wilkinson at dsto.defence.gov.au
Thu Aug 24 00:18:32 UTC 2006
On Wed, Aug 23, 2006 at 02:04:20PM +0400, Gleb Smirnoff wrote:
>On Wed, Aug 23, 2006 at 06:55:04PM +0900, Pyun YongHyeon wrote:
>P> On Wed, Aug 23, 2006 at 01:37:41PM +0400, Gleb Smirnoff wrote:
>P> > On Tue, Aug 22, 2006 at 01:20:23PM +0900, Pyun YongHyeon wrote:
>P> > P> After fixing the em(4) watchdog bug, I looked over bge(4) and I think
>P> > P> bge(4) may suffer from the same issue. So if you have seen occasional
>P> > P> watchdog timeout errors on bge(4), please give the attached patch a try.
>P> > P> The patch fixes only false watchdog timeout errors.
>P> > P> Typical symptoms of a false watchdog timeout error are:
>P> > P> o polling(4) fixes the issue
>P> > P> o random watchdog errors
>P> > P>
>P> > P> If my patch fixes the issue, you may see one of the following messages:
>P> > P> "missing Tx completion interrupt!" or "link lost -- resetting"
>P> >
>P> > I still think that this fix is incorrect. It is just a more gentle
>P> > recovery from a fake watchdog timeout.
>P>
>P> Its sole purpose is to reinitialize the hardware for real watchdog
>P> timeouts. It's not a fix for general watchdog timeouts. As I said in other
>P> mails, the fake watchdog timeout (losing Tx interrupts) on hardware
>P> with Tx interrupt moderation capability could be a normal thing. So I
>P> just want to know whether bge(4) also has the same feature (bug).
>
>According to several emails about em(4) fake watchdog timeouts, the
>problem can be fixed by setting debug.mpsafenet=0. This makes me think
>that the problem isn't caused by TX interrupt moderation, but some race
>in the kernel. Really, if_slowtimo() doesn't acquire driver lock before
>checking and modifying the if_timer field.
>
>AFAIK, NIC drivers that can do interrupt moderation should set the
>watchdog timer to a sane value, based on the interrupt moderation
>settings, so that the watchdog is never triggered falsely.
What is interrupt moderation?
-aW
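
For readers following the thread, the relationship described above between
interrupt moderation and the watchdog timer can be sketched as a toy model.
With moderation, the NIC holds back the Tx completion interrupt for a while so
that several completions can be reported at once. If the watchdog is armed
with a shorter interval than that delay, it fires even though nothing is
wrong. This is only an illustration under stated assumptions: struct tx_model,
false_watchdog() and safe_watchdog_ticks() are hypothetical names, the "tick"
unit is abstract, and none of this is taken from the bge(4) or em(4) sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one NIC's Tx settings; not real driver code. */
struct tx_model {
    int coalesce_ticks;  /* Tx completion interrupt is held back this long */
    int watchdog_ticks;  /* value the watchdog counter is armed with on send */
};

/*
 * Simulate a single transmitted frame. The frame itself is sent fine, but
 * the completion interrupt only arrives after coalesce_ticks ticks. The
 * watchdog counter is decremented once per tick, loosely modelled on a
 * periodic timer counting an if_timer-like field down to zero.
 *
 * Returns true if the watchdog expires before the completion interrupt
 * is delivered, i.e. a false watchdog timeout.
 */
bool false_watchdog(const struct tx_model *m)
{
    assert(m->coalesce_ticks >= 0 && m->watchdog_ticks > 0);

    int timer = m->watchdog_ticks;
    for (int tick = 1; ; tick++) {
        /* Completion interrupt delivered: the driver disarms the watchdog. */
        if (tick >= m->coalesce_ticks)
            return false;
        /* Otherwise the periodic timer counts the watchdog down. */
        if (--timer == 0)
            return true;
    }
}

/*
 * Assumed policy: derive the watchdog interval from the moderation delay
 * plus a safety margin, so the watchdog cannot expire first.
 */
int safe_watchdog_ticks(int coalesce_ticks, int margin)
{
    assert(coalesce_ticks >= 0 && margin > 0);
    return coalesce_ticks + margin;
}
```

In this model a watchdog armed with 3 ticks against a 5-tick moderation delay
reports a false timeout, while one armed via safe_watchdog_ticks(5, 2) does
not. It does not model the locking race around if_timer that is also raised
in the thread.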