em card wedging
Ivan Voras
ivoras at freebsd.org
Fri Nov 19 22:01:44 UTC 2010
This problem is separate, on a separate system, from those I've been
reporting the last few days, just in case someone read them all :)
An on-board em card in a server (Supermicro motherboard) wedges after a
couple of minutes of operation. There are continuous "watchdog timeout"
messages on the console, but the timeouts don't recover the card and it
stays wedged forever. When this happens, monitoring the network state with
"netstat 1" suddenly starts outputting garbage values (large 64-bit
numbers, always constant) for incoming and outgoing packet counts,
as if there were some kind of kernel memory corruption.
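One way to pin down the moment the counters go bad is to sanity-check each per-second sample against what gigabit Ethernet can physically deliver. The following is only a minimal sketch, not something I ran on this server: the function name `implausible_samples` and the sample values are hypothetical, and the limit assumes a 1 Gbit/s link carrying minimum-size frames (64 bytes plus 20 bytes of preamble and inter-frame gap).

```python
# Upper bound on packets per second for 1 Gbit/s Ethernet (assumed link speed):
# a minimum-size frame occupies 64 + 8 (preamble) + 12 (inter-frame gap)
# bytes on the wire.
LINK_BITS_PER_SECOND = 1_000_000_000
MIN_WIRE_BYTES = 64 + 8 + 12
MAX_PACKETS_PER_SECOND = LINK_BITS_PER_SECOND // (MIN_WIRE_BYTES * 8)


def implausible_samples(samples, limit=MAX_PACKETS_PER_SECOND):
    """Return (index, value) pairs for per-second packet counts above the limit."""
    return [(i, v) for i, v in enumerate(samples) if v > limit]


if __name__ == "__main__":
    # Hypothetical per-second packet counts; the last two mimic a huge
    # constant value of the kind described above.
    counts = [1200, 1350, 2**63, 2**63]
    print(implausible_samples(counts))
```

Any sample above the limit cannot be a real packet count for this link, so it points at a corrupted counter rather than at heavy traffic.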
This can be quickly provoked on-demand by doing flood-ping (ping -f).
The card has two ports, em0 and em1. If I move the Ethernet cable from
em0 to em1 and bring em1 up, then *both* interfaces indicate in their
ifconfig status that they have signal (active), but after a few packets
are exchanged over em1 (DHCP) it also hangs.
This is 8-stable amd64 (the behaviour was much worse on 8.0-release
and 8.1-release - the card stopped working after a few seconds) with
this hardware:
em0: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0xdc00-0xdc1f mem 0xfb5e0000-0xfb5fffff,0xfb5dc000-0xfb5dffff irq 16 at device 0.0 on pci3
em0: Using MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:25:90:0b:77:5c
em1: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0xec00-0xec1f mem 0xfb6e0000-0xfb6fffff,0xfb6dc000-0xfb6dffff irq 17 at device 0.0 on pci4
em1: Using MSI interrupt
em1: [FILTER]
em1: Ethernet address: 00:25:90:0b:77:5d
em0 at pci0:3:0:0: class=0x020000 card=0x040d15d9 chip=0x10d38086 rev=0x00 hdr=0x00
vendor = 'Intel Corporation'
device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
class = network
subclass = ethernet
bar [10] = type Memory, range 32, base 0xfb5e0000, size 131072, enabled
bar [18] = type I/O Port, range 32, base 0xdc00, size 32, enabled
bar [1c] = type Memory, range 32, base 0xfb5dc000, size 16384, enabled
cap 01[c8] = powerspec 2 supports D0 D3 current D0
cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
cap 11[a0] = MSI-X supports 5 messages in map 0x1c
em1 at pci0:4:0:0: class=0x020000 card=0x040d15d9 chip=0x10d38086 rev=0x00 hdr=0x00
vendor = 'Intel Corporation'
device = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
class = network
subclass = ethernet
bar [10] = type Memory, range 32, base 0xfb6e0000, size 131072, enabled
bar [18] = type I/O Port, range 32, base 0xec00, size 32, enabled
bar [1c] = type Memory, range 32, base 0xfb6dc000, size 16384, enabled
cap 01[c8] = powerspec 2 supports D0 D3 current D0
cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
cap 11[a0] = MSI-X supports 5 messages in map 0x1c
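For anyone matching these IDs against a device list: the chip= and card= values in the pciconf output pack two 16-bit IDs into one 32-bit number, with the vendor ID in the low half and the device ID in the high half. A small sketch of the split follows; the helper name `split_pci_id` is made up for illustration.

```python
def split_pci_id(value):
    """Split a 32-bit pciconf chip=/card= value into (vendor_id, device_id)."""
    vendor_id = value & 0xFFFF          # low 16 bits
    device_id = (value >> 16) & 0xFFFF  # high 16 bits
    return vendor_id, device_id


if __name__ == "__main__":
    # Values copied from the pciconf output above.
    for label, value in (("chip", 0x10D38086), ("card", 0x040D15D9)):
        vendor, device = split_pci_id(value)
        print(f"{label}: vendor=0x{vendor:04x} device=0x{device:04x}")
```

For chip=0x10d38086 this gives vendor 0x8086 (Intel) and device 0x10d3, which agrees with the 82574L description that pciconf prints.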
Interestingly, IPMI, which also works over the same port (and is in
fact on the same subnet as the "main" port) continues working
while all this is happening.
The BIOS configuration doesn't contain anything directly related to
advanced NIC settings, but it does contain several PCI-E settings, in
case there is a chance that toggling them would help.
While the card is wedged like this, the server cannot be shut down or
restarted by software: the whole machine hangs after flushing vnodes
& buffers and has to be cold-cycled.