Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

Jason Wolfe nitroboost at gmail.com
Mon Nov 7 17:54:33 UTC 2011


On Mon, Oct 31, 2011 at 1:13 AM, Hooman Fazaeli <hoomanfazaeli at gmail.com>wrote:

>
> Attached is a patch for if_em.c. It flushes interface queue when it is
> full
> and link is not active. Please note that when this happens, drops are
> increasing
> on interface and this will trigger your scripts as before. You need to
> change
> a little the scripts as follows:
>
>               check interface TX status
>               if (interface TX seems hung) {
>                         sleep 5
>                         check interface TX status
>                         if (interface TX seems hung) {
>                              reset the interface.
>                         }
>                 }
>
> For MULTIQUEUE, it just disables the check for link status (which is not
> good).
> so pls. test in non-MULTIQUEUE mode.
>
> The patch also contains some minor fixups to compile on 7 plus
> a fix from r1.69 which addressed RX hang problem (the fix was
> later removed in r1.70). I included it for Emil to give it a
> try.
>
> Pls. let me know if you have any problems with patch.
>
>
Hooman,

Unfortunately  one of the server just had a wedge event a couple hours ago
with this patch.  To confirm your changes should cause a recovery within
the time I'm allowing, here is the current format:

check interface TX status
if (interface TX seems hung) {
    sleep 3
    check packets out
    sleep 2
    check packets out
    if (packets not incrementing) {
        reset the interface
    }
}

I bounced em0 because dropped packets incremented 1749543 to 1749708 and
the interface is not incrementing packets out.

4:10AM up 6 days, 15:23, 0 users, load averages: 0.02, 0.12, 0.14

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
ether X
inet6 X%em0 prefixlen 64 scopeid 0x1
nd6 options=1<PERFORMNUD>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active

em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
ether X
inet6 X%em1 prefixlen 64 scopeid 0x2
nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active

ipfw0: flags=8801<UP,SIMPLEX,MULTICAST> metric 0 mtu 65536

lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=3<RXCSUM,TXCSUM>
inet 127.0.0.1 netmask 0xff000000
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
inet X.X.X.X netmask 0xffffffff
inet X.X.X.X netmask 0xffffffff
inet X.X.X.X netmask 0xffffffff
nd6 options=3<PERFORMNUD,ACCEPT_RTADV>

lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
ether X
inet X.X.X.X netmask 0xffffff00 broadcast X.X.X.X
inet6 X%lagg0 prefixlen 64 scopeid 0x5
inet6 X prefixlen 64 autoconf
nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
media: Ethernet autoselect
status: active
laggproto loadbalance
laggport: em0 flags=4<ACTIVE>
laggport: em1 flags=4<ACTIVE>

interrupt total rate
irq3: uart1 3810 0
cpu0: timer 1147568087 2000
irq256: em0:rx 0 59779710 104
irq257: em0:tx 0 2771888652 4831
irq258: em0:link 1 0
irq259: em1:rx 0 3736828886 6512
irq260: em1:tx 0 2790566376 4863
irq261: em1:link 27286 0
irq262: mps0 395687386 689
cpu1: timer 1147559894 2000
cpu2: timer 1147559901 2000
cpu3: timer 1147559902 2000
Total 14345029891 25001

13466/4144/17610 mbufs in use (current/cache/total)
2567/2635/5202/5853720 mbuf clusters in use (current/cache/total/max)
2567/633 mbuf+clusters out of packet secondary zone in use (current/cache)
6798/554/7352/2926859 4k (page size) jumbo clusters in use
(current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
35692K/8522K/44214K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll Drop
em0 1500 <Link#1> 00:25:90:2b:e5:75 60747643 0 0 11246408092 0 0 1750763
em0 1500 fe80:1::225:9 fe80:1::225:90ff: 0 - - 4 - - -
em1 1500 <Link#2> 00:25:90:2b:e5:75 11237195776 123950 0 11344722383 0 0
545682
em1 1500 fe80:2::225:9 fe80:2::225:90ff: 0 - - 1 - - -
lagg0 1500 <Link#5> 00:25:90:2b:e5:75 11297850142 0 0 22588666102 2296445 0
0
lagg0 1500 69.164.38.0/2 69.164.38.83 10189108030 - - 22592881776 - - -
lagg0 1500 fe80:5::225:9 fe80:5::225:90ff: 24 - - 28 - - -
lagg0 1500 2607:f4e8:310 2607:f4e8:310:12: 19578 - - 19591 - - -

kern.msgbuf:

Nov 7 04:10:06 cds1033 kernel: Interface is RUNNING and INACTIVE
Nov 7 04:10:06 cds1033 kernel: em0: hw tdh = 558, hw tdt = 558
Nov 7 04:10:06 cds1033 kernel: em0: hw rdh = 889, hw rdt = 888
Nov 7 04:10:06 cds1033 kernel: em0: Tx Queue Status = 0
Nov 7 04:10:06 cds1033 kernel: em0: TX descriptors avail = 1024
Nov 7 04:10:06 cds1033 kernel: em0: Tx Descriptors avail failure = 0
Nov 7 04:10:06 cds1033 kernel: em0: RX discarded packets = 0
Nov 7 04:10:06 cds1033 kernel: em0: RX Next to Check = 889
Nov 7 04:10:06 cds1033 kernel: em0: RX Next to Refresh = 888
Nov 7 04:10:06 cds1033 kernel: em0: Link state: active
Nov 7 04:10:06 cds1033 kernel: Interface is RUNNING and INACTIVE
Nov 7 04:10:06 cds1033 kernel: em1: hw tdh = 837, hw tdt = 837
Nov 7 04:10:06 cds1033 kernel: em1: hw rdh = 198, hw rdt = 189
Nov 7 04:10:06 cds1033 kernel: em1: Tx Queue Status = 0
Nov 7 04:10:06 cds1033 kernel: em1: TX descriptors avail = 1021
Nov 7 04:10:06 cds1033 kernel: em1: Tx Descriptors avail failure = 0
Nov 7 04:10:06 cds1033 kernel: em1: RX discarded packets = 0
Nov 7 04:10:06 cds1033 kernel: em1: RX Next to Check = 303
Nov 7 04:10:06 cds1033 kernel: em1: RX Next to Refresh = 302
Nov 7 04:10:06 cds1033 kernel: em1: Link state: active

net.inet.ip.intr_queue_maxlen: 512
net.inet.ip.intr_queue_drops: 0
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x15d9
subdevice=0x10d3 class=0x020000
dev.em.0.%parent: pci1
dev.em.0.nvm: -1
dev.em.0.debug: -1
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: 100
dev.em.0.flow_control: 0
dev.em.0.eee_control: 0
dev.em.0.link_irq: 1
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 0
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.link_toggles: 3
dev.em.0.device_control: 1074790984
dev.em.0.rx_control: 67141634
dev.em.0.fc_high_water: 18432
dev.em.0.fc_low_water: 16932
dev.em.0.queue0.txd_head: 558
dev.em.0.queue0.txd_tail: 558
dev.em.0.queue0.tx_irq: 2771888613
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 102
dev.em.0.queue0.rxd_tail: 101
dev.em.0.queue0.rx_irq: 59779941
dev.em.0.mac_stats.excess_coll: 0
dev.em.0.mac_stats.single_coll: 0
dev.em.0.mac_stats.multiple_coll: 0
dev.em.0.mac_stats.late_coll: 0
dev.em.0.mac_stats.collision_count: 0
dev.em.0.mac_stats.symbol_errors: 0
dev.em.0.mac_stats.sequence_errors: 0
dev.em.0.mac_stats.defer_count: 0
dev.em.0.mac_stats.missed_packets: 0
dev.em.0.mac_stats.recv_no_buff: 0
dev.em.0.mac_stats.recv_undersize: 0
dev.em.0.mac_stats.recv_fragmented: 0
dev.em.0.mac_stats.recv_oversize: 0
dev.em.0.mac_stats.recv_jabber: 0
dev.em.0.mac_stats.recv_errs: 0
dev.em.0.mac_stats.crc_errs: 0
dev.em.0.mac_stats.alignment_errs: 0
dev.em.0.mac_stats.coll_ext_errs: 0
dev.em.0.mac_stats.xon_recvd: 0
dev.em.0.mac_stats.xon_txd: 0
dev.em.0.mac_stats.xoff_recvd: 0
dev.em.0.mac_stats.xoff_txd: 0
dev.em.0.mac_stats.total_pkts_recvd: 60768005
dev.em.0.mac_stats.good_pkts_recvd: 60768005
dev.em.0.mac_stats.bcast_pkts_recvd: 60744833
dev.em.0.mac_stats.mcast_pkts_recvd: 2948
dev.em.0.mac_stats.rx_frames_64: 60744822
dev.em.0.mac_stats.rx_frames_65_127: 3491
dev.em.0.mac_stats.rx_frames_128_255: 19293
dev.em.0.mac_stats.rx_frames_256_511: 364
dev.em.0.mac_stats.rx_frames_512_1023: 35
dev.em.0.mac_stats.rx_frames_1024_1522: 0
dev.em.0.mac_stats.good_octets_recvd: 3890774076
dev.em.0.mac_stats.good_octets_txd: 15295538422911
dev.em.0.mac_stats.total_pkts_txd: 11246435226
dev.em.0.mac_stats.good_pkts_txd: 11246435226
dev.em.0.mac_stats.bcast_pkts_txd: 4
dev.em.0.mac_stats.mcast_pkts_txd: 3830
dev.em.0.mac_stats.tx_frames_64: 60338532
dev.em.0.mac_stats.tx_frames_65_127: 897135811
dev.em.0.mac_stats.tx_frames_128_255: 13933787
dev.em.0.mac_stats.tx_frames_256_511: 22107540
dev.em.0.mac_stats.tx_frames_512_1023: 116374045
dev.em.0.mac_stats.tx_frames_1024_1522: 10136545511
dev.em.0.mac_stats.tso_txd: 0
dev.em.0.mac_stats.tso_ctx_fail: 0
dev.em.0.interrupts.asserts: 3
dev.em.0.interrupts.rx_pkt_timer: 0
dev.em.0.interrupts.rx_abs_timer: 0
dev.em.0.interrupts.tx_pkt_timer: 0
dev.em.0.interrupts.tx_abs_timer: 0
dev.em.0.interrupts.tx_queue_empty: 0
dev.em.0.interrupts.tx_queue_min_thresh: 0
dev.em.0.interrupts.rx_desc_min_thresh: 0
dev.em.0.interrupts.rx_overrun: 0
dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.1.%driver: em
dev.em.1.%location: slot=0 function=0
dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x15d9
subdevice=0x10d3 class=0x020000
dev.em.1.%parent: pci2
dev.em.1.nvm: -1
dev.em.1.debug: -1
dev.em.1.rx_int_delay: 0
dev.em.1.tx_int_delay: 66
dev.em.1.rx_abs_int_delay: 66
dev.em.1.tx_abs_int_delay: 66
dev.em.1.rx_processing_limit: 100
dev.em.1.flow_control: 0
dev.em.1.eee_control: 0
dev.em.1.link_irq: 27207
dev.em.1.mbuf_alloc_fail: 0
dev.em.1.cluster_alloc_fail: 0
dev.em.1.dropped: 0
dev.em.1.tx_dma_fail: 0
dev.em.1.rx_overruns: 0
dev.em.1.watchdog_timeouts: 0
dev.em.1.link_toggles: 3
dev.em.1.device_control: 1074790984
dev.em.1.rx_control: 67141634
dev.em.1.fc_high_water: 18432
dev.em.1.fc_low_water: 16932
dev.em.1.queue0.txd_head: 804
dev.em.1.queue0.txd_tail: 804
dev.em.1.queue0.tx_irq: 2790571888
dev.em.1.queue0.no_desc_avail: 0
dev.em.1.queue0.rxd_head: 686
dev.em.1.queue0.rxd_tail: 686
dev.em.1.queue0.rx_irq: 3729597000
dev.em.1.mac_stats.excess_coll: 0
dev.em.1.mac_stats.single_coll: 0
dev.em.1.mac_stats.multiple_coll: 0
dev.em.1.mac_stats.late_coll: 0
dev.em.1.mac_stats.collision_count: 0
dev.em.1.mac_stats.symbol_errors: 0
dev.em.1.mac_stats.sequence_errors: 0
dev.em.1.mac_stats.defer_count: 0
dev.em.1.mac_stats.missed_packets: 123950
dev.em.1.mac_stats.recv_no_buff: 7063
dev.em.1.mac_stats.recv_undersize: 0
dev.em.1.mac_stats.recv_fragmented: 0
dev.em.1.mac_stats.recv_oversize: 0
dev.em.1.mac_stats.recv_jabber: 0
dev.em.1.mac_stats.recv_errs: 0
dev.em.1.mac_stats.crc_errs: 0
dev.em.1.mac_stats.alignment_errs: 0
dev.em.1.mac_stats.coll_ext_errs: 0
dev.em.1.mac_stats.xon_recvd: 0
dev.em.1.mac_stats.xon_txd: 0
dev.em.1.mac_stats.xoff_recvd: 0
dev.em.1.mac_stats.xoff_txd: 0
dev.em.1.mac_stats.total_pkts_recvd: 11237327404
dev.em.1.mac_stats.good_pkts_recvd: 11237203454
dev.em.1.mac_stats.bcast_pkts_recvd: 60741559
dev.em.1.mac_stats.mcast_pkts_recvd: 2935
dev.em.1.mac_stats.rx_frames_64: 4479481614
dev.em.1.mac_stats.rx_frames_65_127: 5124633522
dev.em.1.mac_stats.rx_frames_128_255: 13228214
dev.em.1.mac_stats.rx_frames_256_511: 23321029
dev.em.1.mac_stats.rx_frames_512_1023: 74960116
dev.em.1.mac_stats.rx_frames_1024_1522: 1521578959
dev.em.1.mac_stats.good_octets_recvd: 3024997755633
dev.em.1.mac_stats.good_octets_txd: 15349574964083
dev.em.1.mac_stats.total_pkts_txd: 11344737635
dev.em.1.mac_stats.good_pkts_txd: 11344737635
dev.em.1.mac_stats.bcast_pkts_txd: 2501
dev.em.1.mac_stats.mcast_pkts_txd: 8
dev.em.1.mac_stats.tx_frames_64: 62839963
dev.em.1.mac_stats.tx_frames_65_127: 962219224
dev.em.1.mac_stats.tx_frames_128_255: 14620849
dev.em.1.mac_stats.tx_frames_256_511: 20410906
dev.em.1.mac_stats.tx_frames_512_1023: 117272600
dev.em.1.mac_stats.tx_frames_1024_1522: 10167374093
dev.em.1.mac_stats.tso_txd: 0
dev.em.1.mac_stats.tso_ctx_fail: 0
dev.em.1.interrupts.asserts: 19749
dev.em.1.interrupts.rx_pkt_timer: 1
dev.em.1.interrupts.rx_abs_timer: 0
dev.em.1.interrupts.tx_pkt_timer: 0
dev.em.1.interrupts.tx_abs_timer: 0
dev.em.1.interrupts.tx_queue_empty: 0
dev.em.1.interrupts.tx_queue_min_thresh: 0
dev.em.1.interrupts.rx_desc_min_thresh: 0
dev.em.1.interrupts.rx_overrun: 18
hw.em.rx_hang_fixup: 0
hw.em.eee_setting: 0
hw.em.fc_setting: 0
hw.em.rx_process_limit: 100
hw.em.enable_msix: 1
hw.em.sbp: 0
hw.em.smart_pwr_down: 0
hw.em.txd: 1024
hw.em.rxd: 1024
hw.em.rx_abs_int_delay: 66
hw.em.tx_abs_int_delay: 66
hw.em.rx_int_delay: 0
hw.em.tx_int_delay: 66

Jason


More information about the freebsd-net mailing list