10.1-STABLE bce: Watchdog timeout occurred
pyunyh at gmail.com
Wed Apr 22 05:11:20 UTC 2015
On Wed, Apr 22, 2015 at 12:39:16AM -0400, Chris Ross wrote:
> On Apr 21, 2015, at 10:10 , Gareth Wyn Roberts <g.w.roberts at glyndwr.ac.uk> wrote:
> > This may be caused by DMA alignment problems.
> > See https://docs.freebsd.org/cgi/getmsg.cgi?fetch=145859+0+archive/2015/freebsd-stable/20150419.freebsd-stable for a recent thread about the msk driver. The msk maintainer Yonghyeon Pyun has opted for super safe options of 32K alignment!
> > It's a long shot, but you could try increasing BCE_DMA_ALIGN and/or BCE_RX_BUF_ALIGN in the include file if_bcereg.h, say up to 4096, to see whether it makes any difference.
> Well, after making that change, I was able to confirm that the problem doesn't seem to occur. However, in trying to verify the problem on an unmodified kernel, I've rebooted a GENERIC from r281672 without that change, and am also not seeing the problem. :-/ I'm not sure whether the gremlins have "fixed" something, or if I was just too critical in my initial analysis.
> For now I'll take that change out of my tree and run without it. If I see the flapping again, I'll confirm that it's repeatable, then change the alignments as suggested and see if I see a change.
I suspect the msk(4) alignment issue has nothing to do with the
bce(4) watchdog timeouts. It would be more helpful to know the
details of your controller (bce(4)/brgphy(4)-related dmesg output,
pciconf output, etc.) and your network setup.
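
For reference, a minimal sketch of commands that would collect that
information. The interface name bce0 and the helper name diag_cmds are
assumptions for illustration, not taken from the report. The script
only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Sketch: print the commands that gather the requested details.
# IFACE defaults to bce0, which is an assumption; adjust as needed.
IFACE="${IFACE:-bce0}"

diag_cmds() {
    # Boot messages for the controller and its PHY.
    echo "dmesg | egrep 'bce|brgphy'"
    # PCI device IDs, BARs, and capabilities.
    echo "pciconf -lcbv"
    # Current media, link state, and enabled offload options.
    echo "ifconfig $1"
}

diag_cmds "$IFACE"
```

Run each printed command on the affected host and include the output
in the reply.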
If you know a reliable way to trigger the watchdog timeouts,
please share that too. As a first step to narrow down the issue,
I would try disabling all hardware offloading features (TSO,
checksum offload, VLAN H/W tagging, etc.) and see whether that
makes any difference.
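
A minimal sketch of how the offloads could be turned off with
ifconfig(8), assuming the interface is bce0 (the interface name and
the helper name offload_off_cmd are assumptions for illustration).
The script prints the command rather than running it, since changing
interface options requires root.

```shell
#!/bin/sh
# Sketch: build the ifconfig command that turns off hardware offloads.
# The interface name is an assumption; pass your own as the argument.
offload_off_cmd() {
    # -tso: TCP segmentation offload
    # -txcsum / -rxcsum: transmit / receive checksum offload
    # -vlanhwtag: VLAN hardware tagging
    echo "ifconfig $1 -tso -txcsum -rxcsum -vlanhwtag"
}

# Print the command for review; run it as root by hand.
offload_off_cmd "${1:-bce0}"
```

If the timeouts stop with everything disabled, re-enabling the options
one at a time should show which one is involved.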