Intermittent connectivity loss with em(4)

Doug Hardie bc979 at lafn.org
Sun Oct 6 20:50:17 UTC 2019


> On 6 October 2019, at 12:34, Morgan Wesström <freebsd-database at pp.dyndns.biz> wrote:
> 
> I'd appreciate some help with the following problem because I'm probably blind to the obvious from all my troubleshooting.
> 
> I have a small Supermicro Atom-based ITX-board with 3x Intel 82574L NICs (2 on mainboard, 1 in PCIe slot) and I experience intermittent network connectivity loss every few minutes. The machine is currently running FreeBSD 12.0-RELEASE-p10. I loop a single ping once per minute to dns.google (8.8.8.8) and output looks like this:
> 
> THINGS I'VE TESTED WITHOUT RESOLVING THE PROBLEM
> 
> - Tried all three NICs in the computer but they all show the same behaviour.
> - Tried a different ping target.
> - Tried the LiveCD environment from the older FreeBSD 11.3-RELEASE memstick.
> - Disabled MSI-X interrupts.
> - Additionally disabled MSI interrupts.
> - Recompiled em(4) and enabled DEBUG_INIT, DEBUG_IOCTL and DEBUG_HW but this only generates a few more messages in dmesg during boot. There is nothing shown in dmesg or otherwise when connectivity is lost.
> 
> THINGS I'VE TRIED THAT SEEMINGLY RESOLVE OR ALLEVIATE THE PROBLEM
> 
> - Booting the system on Linux (4.19 kernel) with just a simple command prompt and running the same ping test does _NOT_ show any connectivity loss. At least not during the hour or so I tested. To me this rules out any hardware related problems as well as the network connection itself.
> - Generating small amounts of network traffic on the interface (like an ssh session) seems to reduce the problem. I can then run the ping test for maybe 30 minutes without loss of connectivity in FreeBSD but eventually it fails too.
> 
> I really need help to understand what's going on here. I have a gut feeling some power saving is playing a trick on me but hw.em.smart_pwr_down is set to 0 by default and I have no indication of any power saving function kicking in. How can I debug what's going on in em(4)? Am I just stupid and missing something obvious here?

I just tried the same thing using 12.0-RELEASE-p9.  No drops.

mail# ping -i 60 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: icmp_seq=0 ttl=54 time=10.528 ms
64 bytes from 8.8.8.8: icmp_seq=1 ttl=54 time=11.403 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=54 time=10.351 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=54 time=10.787 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=54 time=12.280 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=54 time=10.038 ms
64 bytes from 8.8.8.8: icmp_seq=6 ttl=54 time=10.640 ms
64 bytes from 8.8.8.8: icmp_seq=7 ttl=54 time=11.090 ms
64 bytes from 8.8.8.8: icmp_seq=8 ttl=54 time=10.615 ms
64 bytes from 8.8.8.8: icmp_seq=9 ttl=54 time=8.972 ms
64 bytes from 8.8.8.8: icmp_seq=10 ttl=54 time=10.016 ms
64 bytes from 8.8.8.8: icmp_seq=11 ttl=54 time=11.097 ms
64 bytes from 8.8.8.8: icmp_seq=12 ttl=54 time=13.035 ms
64 bytes from 8.8.8.8: icmp_seq=13 ttl=54 time=11.251 ms
64 bytes from 8.8.8.8: icmp_seq=14 ttl=54 time=9.588 ms
64 bytes from 8.8.8.8: icmp_seq=15 ttl=54 time=10.649 ms
64 bytes from 8.8.8.8: icmp_seq=16 ttl=54 time=9.965 ms
64 bytes from 8.8.8.8: icmp_seq=17 ttl=54 time=9.900 ms
64 bytes from 8.8.8.8: icmp_seq=18 ttl=54 time=11.253 ms
64 bytes from 8.8.8.8: icmp_seq=19 ttl=54 time=9.440 ms
64 bytes from 8.8.8.8: icmp_seq=20 ttl=54 time=11.544 ms
64 bytes from 8.8.8.8: icmp_seq=21 ttl=54 time=11.068 ms
64 bytes from 8.8.8.8: icmp_seq=22 ttl=54 time=10.196 ms
64 bytes from 8.8.8.8: icmp_seq=23 ttl=54 time=12.446 ms
64 bytes from 8.8.8.8: icmp_seq=24 ttl=54 time=10.467 ms
64 bytes from 8.8.8.8: icmp_seq=25 ttl=54 time=11.100 ms
64 bytes from 8.8.8.8: icmp_seq=26 ttl=54 time=9.746 ms
64 bytes from 8.8.8.8: icmp_seq=27 ttl=54 time=10.627 ms
64 bytes from 8.8.8.8: icmp_seq=28 ttl=54 time=10.339 ms
64 bytes from 8.8.8.8: icmp_seq=29 ttl=54 time=10.166 ms
64 bytes from 8.8.8.8: icmp_seq=30 ttl=54 time=10.055 ms
64 bytes from 8.8.8.8: icmp_seq=31 ttl=54 time=9.303 ms
64 bytes from 8.8.8.8: icmp_seq=32 ttl=54 time=11.240 ms
64 bytes from 8.8.8.8: icmp_seq=33 ttl=54 time=11.407 ms
64 bytes from 8.8.8.8: icmp_seq=34 ttl=54 time=10.930 ms
64 bytes from 8.8.8.8: icmp_seq=35 ttl=54 time=10.166 ms
64 bytes from 8.8.8.8: icmp_seq=36 ttl=54 time=11.400 ms
64 bytes from 8.8.8.8: icmp_seq=37 ttl=54 time=10.946 ms
^C
--- 8.8.8.8 ping statistics ---
38 packets transmitted, 38 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 8.972/10.685/13.035/0.846 ms

I have had issues with em NICs in the past and don't use them anymore.  However, none of those issues were as blatant as what you are seeing.  Most of what dmesg shows is written during boot.  Messages generated during operation also go to /var/log/messages, with timestamps.  Check there to see if there is anything that correlates with the outages.
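
One way to do that correlation is sketched below. It is only a rough sketch: the function names and the /tmp/ping-failures.log path are made up for illustration, and it assumes the default syslog layout, where kernel lines for the interface look something like "em0: link state changed to DOWN".

```shell
#!/bin/sh
# Sketch only: the function names and the failure-log path are hypothetical.

# Print lines from a syslog-style file that mention an em(4) interface
# (em0:, em1:, ...). Defaults to /var/log/messages when no file is given.
em_log_lines() {
    grep -E '(^|[^[:alnum:]_])em[0-9]+:' "${1:-/var/log/messages}"
}

# Send one probe per minute and append a syslog-style timestamp for every
# probe that fails, so outages can be lined up against em_log_lines output.
# Runs until interrupted.
log_ping_failures() {
    target="${1:-8.8.8.8}"
    out="${2:-/tmp/ping-failures.log}"
    while :; do
        # -c 1 sends a single echo request per iteration
        ping -c 1 "$target" >/dev/null 2>&1 ||
            printf '%s ping to %s failed\n' "$(date '+%b %e %H:%M:%S')" "$target" >>"$out"
        sleep 60
    done
}
```

After sourcing the file, something like "log_ping_failures 8.8.8.8 &" can be left running, and "em_log_lines | tail -n 20" then shows the most recent interface messages to compare against the timestamps in the failure log.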

-- Doug


