[Bug 220997] Broken watchdog after iflib update (em0)

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Tue Jul 25 09:18:25 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220997

            Bug ID: 220997
           Summary: Broken watchdog after iflib update (em0)
           Product: Base System
           Version: CURRENT
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs at FreeBSD.org
          Reporter: netchild at FreeBSD.org

Hi,

em0@pci0:2:6:0: class=0x020000 card=0x13768086 chip=0x107c8086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82541PI Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet

head as of r321365

Before the iflib update, this device showed several watchdog and NIC resets per
day. Since the iflib update, the system loses its network connection from time
to time, and in most cases only a reboot brings the connection back.

The system has several VIMAGE jails.

I have no second way to access the system, so any debugging would have to be
scripted. My current workaround is the script below (most of the time it ends
in a reboot; sometimes the ifconfig down/up is enough to restore the network
connection):
---snip---
#!/bin/sh

GW_IP=a.b.c.d

# Collect diagnostics, bounce em0, and mail the result to root.
(
    echo uptime:; /usr/bin/uptime; echo
    /usr/bin/netstat -m; echo
    echo dmesg:; /sbin/dmesg | /usr/bin/tail -50
    echo ping:; /sbin/ping -nc 1 $GW_IP
    ifconfig em0 down; sleep 1; ifconfig em0 up; echo
    echo ping again after ifconfig down/up; /sbin/ping -nc 1 $GW_IP
) | /usr/bin/mail -s "$(hostname): no gateway" root

# Wait a minute, then reboot if the gateway is still unreachable.
/bin/sleep 60
/sbin/ping -nc 1 $GW_IP >/dev/null 2>&1 || /sbin/shutdown -r now >/dev/null 2>&1
---snip---
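
Since debugging has to be scripted, one option would be to save more interface
state to disk before the interface is bounced, so that it survives the reboot.
The following is only a sketch under assumptions: the log path, the function
names and the choice of commands are illustrative and not part of my current
setup, and it assumes the driver exposes its counters under dev.em.0.

```shell
#!/bin/sh
# Sketch only: LOG and the command list below are assumptions.
LOG=${LOG:-/var/log/em0-debug.log}

# Run one command and append a timestamped header plus its output
# (stdout and stderr) to $LOG.
capture() {
    {
        echo "=== $(date -u '+%Y-%m-%d %H:%M:%S') $*"
        "$@" 2>&1
        echo
    } >> "$LOG"
}

# Record interface, driver and interrupt state, then flush to disk.
capture_em0_state() {
    capture /sbin/ifconfig em0
    capture /usr/bin/netstat -I em0 -d
    capture /sbin/sysctl dev.em.0
    capture /usr/bin/vmstat -i
    sync
}
```

Calling capture_em0_state right before the "ifconfig em0 down" step, and again
after "ifconfig em0 up", would give two snapshots to compare.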

Here is an example of the output when it needs to reboot:
---snip---
uptime:
 5:45PM  up  9:30, 1 users, load averages: 0.14, 0.25, 0.58

1283/3277/4560 mbufs in use (current/cache/total)
1024/1818/2842/500678 mbuf clusters in use (current/cache/total/max)
0/1518 mbuf+clusters out of packet secondary zone in use (current/cache)
0/40/40/250338 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/74174 9k jumbo clusters in use (current/cache/total/max)
0/0/0/41723 16k jumbo clusters in use (current/cache/total/max)
2368K/4615K/6984K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
14237 sendfile syscalls
819 sendfile syscalls completed without I/O request
14426 requests for I/O initiated by sendfile
80695 pages read by sendfile as part of a request
11663 pages were valid at time of a sendfile request
0 pages were valid and substituted to bogus page
0 pages were requested for read ahead by applications
0 pages were read ahead by sendfile
0 times sendfile encountered an already busy page
0 requests for sfbufs denied
0 requests for sfbufs delayed

dmesg:
epair3a: Ethernet address: 02:ff:80:00:09:0a
epair3b: Ethernet address: 02:ff:d0:00:0a:0b
epair3a: link state changed to UP
epair3b: link state changed to UP
epair3a: changing name to 'vnet1:2'
epair3b: changing name to 'vnet1'
vnet1:2: promiscuous mode enabled
epair4a: Ethernet address: 02:ff:80:00:0a:0a
epair4b: Ethernet address: 02:ff:d0:00:0b:0b
epair4a: link state changed to UP
epair4b: link state changed to UP
epair4a: changing name to 'vnet0:3'
epair4b: changing name to 'vnet0'
vnet0:3: promiscuous mode enabled
epair5a: Ethernet address: 02:ff:80:00:0b:0a
epair5b: Ethernet address: 02:ff:d0:00:0c:0b
epair5a: link state changed to UP
epair5b: link state changed to UP
epair5a: changing name to 'vnet1:3'
epair5b: changing name to 'vnet1'
vnet1:3: promiscuous mode enabled
epair6a: Ethernet address: 02:ff:80:00:0c:0a
epair6b: Ethernet address: 02:ff:d0:00:0d:0b
epair6a: link state changed to UP
epair6b: link state changed to UP
epair6a: changing name to 'vnet0:4'
epair6b: changing name to 'vnet0'
vnet0:4: promiscuous mode enabled
epair7a: Ethernet address: 02:ff:80:00:0d:0a
epair7b: Ethernet address: 02:ff:d0:00:0e:0b
epair7a: link state changed to UP
epair7b: link state changed to UP
epair7a: changing name to 'vnet1:4'
epair7b: changing name to 'vnet1'
vnet1:4: promiscuous mode enabled
epair8a: Ethernet address: 02:ff:80:00:0e:0a
epair8b: Ethernet address: 02:ff:d0:00:0f:0b
epair8a: link state changed to UP
epair8b: link state changed to UP
epair8a: changing name to 'vnet0:5'
epair8b: changing name to 'vnet0'
vnet0:5: promiscuous mode enabled
epair9a: Ethernet address: 02:ff:80:00:0f:0a
epair9b: Ethernet address: 02:ff:d0:00:10:0b
epair9a: link state changed to UP
epair9b: link state changed to UP
epair9a: changing name to 'vnet1:5'
epair9b: changing name to 'vnet1'
vnet1:5: promiscuous mode enabled
sonewconn: pcb 0xfffff8009e728848: Listen queue overflow: 5 already in queue awaiting acceptance (1 occurrences)
ping:
PING $GW_IP ($GW_IP): 56 data bytes

--- $GW_IP ping statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss

ping again after ifconfig down/up
PING $GW_IP ($GW_IP): 56 data bytes
64 bytes from $GW_IP: icmp_seq=0 ttl=64 time=6457.660 ms

--- $GW_IP ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 6457.660/6457.660/6457.660/0.000 ms
---snip---
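
In the output above, the ping after the ifconfig down/up took about 6.5 seconds
to come back, so a single ping may be a fragile basis for the reboot decision.
A possible variation is to retry a few times before giving up. This is a
sketch, not what the script currently does; PING, TRIES, PAUSE and
gateway_reachable are hypothetical names and the values are arbitrary.

```shell
#!/bin/sh
# Sketch only: PING, TRIES and PAUSE are illustrative names and values.
PING=${PING:-/sbin/ping}
TRIES=${TRIES:-5}
PAUSE=${PAUSE:-10}

# Return 0 as soon as one ping to $1 succeeds,
# or 1 if all $TRIES attempts fail.
gateway_reachable() {
    i=0
    while [ "$i" -lt "$TRIES" ]; do
        "$PING" -nc 1 "$1" >/dev/null 2>&1 && return 0
        i=$((i + 1))
        # Pause between attempts, but not after the last one.
        [ "$i" -lt "$TRIES" ] && sleep "$PAUSE"
    done
    return 1
}
```

The last line of the workaround script could then become
gateway_reachable "$GW_IP" || /sbin/shutdown -r now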

Bye,
Alexander.
