Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

Jason Wolfe nitroboost at gmail.com
Mon Oct 31 04:03:06 UTC 2011


On Sun, Oct 30, 2011 at 1:57 AM, Hooman Fazaeli <hoomanfazaeli at gmail.com>wrote:

>
> I finally managed to re-produce an affect similar to Jason's case. It
> may not be the exact same issue, but it is a serious problem and must
> be addressed.
>
> 1. Push out packet on em/igb with high rate.
> 2. Disconnect cable and wait for a few seconds. "netstat -ind" shows that
>   Drops are increasing.
> 3. Re-connect the cable. Both sides of like re-negotiate and the links
> comes up.
> 4. But ..., no packets is ever transmitted again and Drops still
> increasing!
>
> This is because em/lem/igb and some other interfaces (i.e., bce) have
> a check at the very beginning of their _start function
> which checks link status and immediately returns if it is inactive.
> This behavior causes if_snd to fills up in step 2 and when this happens,
> IFQ_HANDOFF never calls if_start again, even when the link becomes
> active again.
>
> A cable unplug is not necessary to trigger the issue. Any temporary
> link loss (e.i., during re-negotiation) can potentially lead to
> aforementioned problem.
>
> IMHO, this is not a driver issue and the real fix would be to change
> IFQ_HANDOFF to call if_start when the queue is full.
>
> Jason, If you are interested, I can prepare a patch for you
>  to address this issue in if_em and see if it helps.
>

Hooman,

Thanks for looking into this.  I'd be happy to test any patch thrown my
way, but keep in mind my issue is only tickled when MSI-X is enabled.  My
interfaces aren't bouncing, though it might be possible some unique path in
the MSI-X code is causing a throughput hang akin to connectivity loss?

Jack is the delta your speaking to the 7.2.4 code?  I did manage to get the
code from Intel compiled with a couple minutes of work, but haven't loaded
it up yet as I didn't see anything that caught my untrained eye in the
diffs.  I'll wait until its ported over and would be happy to test if
needed.

Conveniently enough I just received another report from my test boxes with
a pretty stock loader.conf.  I had forgotten to remove the advanced options
from the interfaces after I cycled them to pick up the fc_setting=0.  Fixed
that up just meow.

hw.em.fc_setting="0"
cc_cubic_load="YES"

I bounced em0 because dropped packets incremented 368756 to 369124 and the
interface is not incrementing packets out.

5:35PM up 2 days, 17:45, 0 users, load averages: 0.34, 0.45, 0.48

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
ether X
inet6 X%em0 prefixlen 64 scopeid 0x1
nd6 options=1<PERFORMNUD>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active

em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
ether X
inet6 X%em1 prefixlen 64 scopeid 0x2
inet6 X prefixlen 64 autoconf
nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active

ipfw0: flags=8801<UP,SIMPLEX,MULTICAST> metric 0 mtu 65536

lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=3<RXCSUM,TXCSUM>
inet 127.0.0.1 netmask 0xff000000
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
inet X.X.X.X netmask 0xffffffff
inet X.X.X.X netmask 0xffffffff
inet X.X.X.X netmask 0xffffffff
inet X.X.X.X netmask 0xffffffff
inet X.X.X.X netmask 0xffffffff
inet X.X.X.X netmask 0xffffffff
inet X.X.X.X netmask 0xffffffff
inet X.X.X.X netmask 0xffffffff
nd6 options=3<PERFORMNUD,ACCEPT_RTADV>

lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
ether X
inet X.X.X.X netmask 0xffffff00 broadcast X.X.X.X
inet6 X%lagg0 prefixlen 64 scopeid 0x5
nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
media: Ethernet autoselect
status: active
laggproto loadbalance
laggport: em0 flags=4<ACTIVE>
laggport: em1 flags=4<ACTIVE>

interrupt total rate
irq3: uart1 3456 0
cpu0: timer 473404250 2000
irq256: em0:rx 0 24614350 103
irq257: em0:tx 0 1220810972 5157
irq258: em0:link 1 0
irq259: em1:rx 0 1533295149 6477
irq260: em1:tx 0 1194032538 5044
irq261: em1:link 3272 0
irq262: mps0 189602667 801
cpu3: timer 473396089 2000
cpu1: timer 473396089 2000
cpu2: timer 473396081 2000
Total 6055954914 25585

32999/8476/41475 mbufs in use (current/cache/total)
4064/3398/7462/5872038 mbuf clusters in use (current/cache/total/max)
4064/800 mbuf+clusters out of packet secondary zone in use (current/cache)
24900/669/25569/2936019 4k (page size) jumbo clusters in use
(current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
115977K/11591K/127568K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
61 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll Drop
em0 1500 <Link#1> 00:25:90:2a:a2:d7 24946787 0 0 5734180355 0 0 369844
em0 1500 fe80:1::225:9 fe80:1::225:90ff: 0 - - 2 - - -
em1 1500 <Link#2> 00:25:90:2a:a2:d7 5220869518 15996 0 5429971995 0 0 37009
em1 1500 fe80:2::225:9 fe80:2::225:90ff: 0 - - 1 - - -
em1 1500 2607:f4e8:310 2607:f4e8:310:12: 0 - - 0 - - -
lagg0 1500 <Link#5> 00:25:90:2a:a2:d7 5245767782 0 0 11162877037 406853 0 0
lagg0 1500 69.164.38.0/2 69.164.38.69 4776881809 - - 11164303625 - - -
lagg0 1500 fe80:5::225:9 fe80:5::225:90ff: 0 - - 3 - - -

kern.msgbuf:

Oct 30 17:08:38 cds1019 kernel: ifa_add_loopback_route: insertion failed
Oct 30 17:12:10 cds1019 kernel: ifa_add_loopback_route: insertion failed
Oct 30 17:20:20 cds1019 last message repeated 3 times
Oct 30 17:32:13 cds1019 last message repeated 4 times
Oct 30 17:34:27 cds1019 kernel: ifa_add_loopback_route: insertion failed
Oct 30 17:35:03 cds1019 kernel: Interface is RUNNING and INACTIVE
Oct 30 17:35:03 cds1019 kernel: em0: hw tdh = 818, hw tdt = 818
Oct 30 17:35:03 cds1019 kernel: em0: hw rdh = 99, hw rdt = 98
Oct 30 17:35:03 cds1019 kernel: em0: Tx Queue Status = 0
Oct 30 17:35:03 cds1019 kernel: em0: TX descriptors avail = 1024
Oct 30 17:35:03 cds1019 kernel: em0: Tx Descriptors avail failure = 0
Oct 30 17:35:03 cds1019 kernel: em0: RX discarded packets = 0
Oct 30 17:35:03 cds1019 kernel: em0: RX Next to Check = 110
Oct 30 17:35:03 cds1019 kernel: em0: RX Next to Refresh = 109
Oct 30 17:35:03 cds1019 kernel: em0: Link state: active
Oct 30 17:35:03 cds1019 kernel: Interface is RUNNING and INACTIVE
Oct 30 17:35:03 cds1019 kernel: em1: hw tdh = 688, hw tdt = 688
Oct 30 17:35:03 cds1019 kernel: em1: hw rdh = 861, hw rdt = 860
Oct 30 17:35:03 cds1019 kernel: em1: Tx Queue Status = 1
Oct 30 17:35:03 cds1019 kernel: em1: TX descriptors avail = 1024
Oct 30 17:35:03 cds1019 kernel: em1: Tx Descriptors avail failure = 0
Oct 30 17:35:03 cds1019 kernel: em1: RX discarded packets = 0
Oct 30 17:35:03 cds1019 kernel: em1: RX Next to Check = 19
Oct 30 17:35:03 cds1019 kernel: em1: RX Next to Refresh = 85
Oct 30 17:35:03 cds1019 kernel: em1: Link state: active

net.inet.ip.intr_queue_maxlen: 512
net.inet.ip.intr_queue_drops: 0
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x15d9
subdevice=0x10d3 class=0x020000
dev.em.0.%parent: pci1
dev.em.0.nvm: -1
dev.em.0.debug: -1
dev.em.0.rx_int_delay: 0
dev.em.0.tx_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_processing_limit: 100
dev.em.0.flow_control: 0
dev.em.0.eee_control: 0
dev.em.0.link_irq: 1
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 0
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1074790984
dev.em.0.rx_control: 67141634
dev.em.0.fc_high_water: 18432
dev.em.0.fc_low_water: 16932
dev.em.0.queue0.txd_head: 818
dev.em.0.queue0.txd_tail: 818
dev.em.0.queue0.tx_irq: 1220810942
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 314
dev.em.0.queue0.rxd_tail: 313
dev.em.0.queue0.rx_irq: 24614539
dev.em.0.mac_stats.excess_coll: 0
dev.em.0.mac_stats.single_coll: 0
dev.em.0.mac_stats.multiple_coll: 0
dev.em.0.mac_stats.late_coll: 0
dev.em.0.mac_stats.collision_count: 0
dev.em.0.mac_stats.symbol_errors: 0
dev.em.0.mac_stats.sequence_errors: 0
dev.em.0.mac_stats.defer_count: 0
dev.em.0.mac_stats.missed_packets: 0
dev.em.0.mac_stats.recv_no_buff: 0
dev.em.0.mac_stats.recv_undersize: 0
dev.em.0.mac_stats.recv_fragmented: 0
dev.em.0.mac_stats.recv_oversize: 0
dev.em.0.mac_stats.recv_jabber: 0
dev.em.0.mac_stats.recv_errs: 0
dev.em.0.mac_stats.crc_errs: 0
dev.em.0.mac_stats.alignment_errs: 0
dev.em.0.mac_stats.coll_ext_errs: 0
dev.em.0.mac_stats.xon_recvd: 0
dev.em.0.mac_stats.xon_txd: 0
dev.em.0.mac_stats.xoff_recvd: 0
dev.em.0.mac_stats.xoff_txd: 0
dev.em.0.mac_stats.total_pkts_recvd: 24955229
dev.em.0.mac_stats.good_pkts_recvd: 24955229
dev.em.0.mac_stats.bcast_pkts_recvd: 24945466
dev.em.0.mac_stats.mcast_pkts_recvd: 1377
dev.em.0.mac_stats.rx_frames_64: 24945502
dev.em.0.mac_stats.rx_frames_65_127: 1606
dev.em.0.mac_stats.rx_frames_128_255: 7963
dev.em.0.mac_stats.rx_frames_256_511: 157
dev.em.0.mac_stats.rx_frames_512_1023: 1
dev.em.0.mac_stats.rx_frames_1024_1522: 0
dev.em.0.mac_stats.good_octets_recvd: 1597800626
dev.em.0.mac_stats.good_octets_txd: 7770925595880
dev.em.0.mac_stats.total_pkts_txd: 5734191501
dev.em.0.mac_stats.good_pkts_txd: 5734191501
dev.em.0.mac_stats.bcast_pkts_txd: 4
dev.em.0.mac_stats.mcast_pkts_txd: 1584
dev.em.0.mac_stats.tx_frames_64: 25504167
dev.em.0.mac_stats.tx_frames_65_127: 485744914
dev.em.0.mac_stats.tx_frames_128_255: 5963478
dev.em.0.mac_stats.tx_frames_256_511: 9021090
dev.em.0.mac_stats.tx_frames_512_1023: 55803019
dev.em.0.mac_stats.tx_frames_1024_1522: 5152154833
dev.em.0.mac_stats.tso_txd: 0
dev.em.0.mac_stats.tso_ctx_fail: 0
dev.em.0.interrupts.asserts: 3
dev.em.0.interrupts.rx_pkt_timer: 0
dev.em.0.interrupts.rx_abs_timer: 0
dev.em.0.interrupts.tx_pkt_timer: 0
dev.em.0.interrupts.tx_abs_timer: 0
dev.em.0.interrupts.tx_queue_empty: 0
dev.em.0.interrupts.tx_queue_min_thresh: 0
dev.em.0.interrupts.rx_desc_min_thresh: 0
dev.em.0.interrupts.rx_overrun: 0
dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.1.%driver: em
dev.em.1.%location: slot=0 function=0
dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x15d9
subdevice=0x10d3 class=0x020000
dev.em.1.%parent: pci2
dev.em.1.nvm: -1
dev.em.1.debug: -1
dev.em.1.rx_int_delay: 0
dev.em.1.tx_int_delay: 66
dev.em.1.rx_abs_int_delay: 66
dev.em.1.tx_abs_int_delay: 66
dev.em.1.rx_processing_limit: 100
dev.em.1.flow_control: 0
dev.em.1.eee_control: 0
dev.em.1.link_irq: 3272
dev.em.1.mbuf_alloc_fail: 0
dev.em.1.cluster_alloc_fail: 0
dev.em.1.dropped: 0
dev.em.1.tx_dma_fail: 0
dev.em.1.rx_overruns: 0
dev.em.1.watchdog_timeouts: 0
dev.em.1.device_control: 1074790984
dev.em.1.rx_control: 67141634
dev.em.1.fc_high_water: 18432
dev.em.1.fc_low_water: 16932
dev.em.1.queue0.txd_head: 9
dev.em.1.queue0.txd_tail: 14
dev.em.1.queue0.tx_irq: 1194045687
dev.em.1.queue0.no_desc_avail: 0
dev.em.1.queue0.rxd_head: 317
dev.em.1.queue0.rxd_tail: 315
dev.em.1.queue0.rx_irq: 1528821076
dev.em.1.mac_stats.excess_coll: 0
dev.em.1.mac_stats.single_coll: 0
dev.em.1.mac_stats.multiple_coll: 0
dev.em.1.mac_stats.late_coll: 0
dev.em.1.mac_stats.collision_count: 0
dev.em.1.mac_stats.symbol_errors: 0
dev.em.1.mac_stats.sequence_errors: 0
dev.em.1.mac_stats.defer_count: 0
dev.em.1.mac_stats.missed_packets: 15996
dev.em.1.mac_stats.recv_no_buff: 1035
dev.em.1.mac_stats.recv_undersize: 0
dev.em.1.mac_stats.recv_fragmented: 0
dev.em.1.mac_stats.recv_oversize: 0
dev.em.1.mac_stats.recv_jabber: 0
dev.em.1.mac_stats.recv_errs: 0
dev.em.1.mac_stats.crc_errs: 0
dev.em.1.mac_stats.alignment_errs: 0
dev.em.1.mac_stats.coll_ext_errs: 0
dev.em.1.mac_stats.xon_recvd: 0
dev.em.1.mac_stats.xon_txd: 0
dev.em.1.mac_stats.xoff_recvd: 0
dev.em.1.mac_stats.xoff_txd: 0
dev.em.1.mac_stats.total_pkts_recvd: 5220907923
dev.em.1.mac_stats.good_pkts_recvd: 5220891927
dev.em.1.mac_stats.bcast_pkts_recvd: 24943828
dev.em.1.mac_stats.mcast_pkts_recvd: 1375
dev.em.1.mac_stats.rx_frames_64: 1897964107
dev.em.1.mac_stats.rx_frames_65_127: 2457591621
dev.em.1.mac_stats.rx_frames_128_255: 5971619
dev.em.1.mac_stats.rx_frames_256_511: 11235726
dev.em.1.mac_stats.rx_frames_512_1023: 32267871
dev.em.1.mac_stats.rx_frames_1024_1522: 815860983
dev.em.1.mac_stats.good_octets_recvd: 1562681996232
dev.em.1.mac_stats.good_octets_txd: 7334946827215
dev.em.1.mac_stats.total_pkts_txd: 5430018684
dev.em.1.mac_stats.good_pkts_txd: 5430018682
dev.em.1.mac_stats.bcast_pkts_txd: 940
dev.em.1.mac_stats.mcast_pkts_txd: 12
dev.em.1.mac_stats.tx_frames_64: 24843579
dev.em.1.mac_stats.tx_frames_65_127: 483343565
dev.em.1.mac_stats.tx_frames_128_255: 5819327
dev.em.1.mac_stats.tx_frames_256_511: 7998506
dev.em.1.mac_stats.tx_frames_512_1023: 44787728
dev.em.1.mac_stats.tx_frames_1024_1522: 4863225979
dev.em.1.mac_stats.tso_txd: 0
dev.em.1.mac_stats.tso_ctx_fail: 0
dev.em.1.interrupts.asserts: 2398
dev.em.1.interrupts.rx_pkt_timer: 1
dev.em.1.interrupts.rx_abs_timer: 0
dev.em.1.interrupts.tx_pkt_timer: 0
dev.em.1.interrupts.tx_abs_timer: 0
dev.em.1.interrupts.tx_queue_empty: 0
dev.em.1.interrupts.tx_queue_min_thresh: 0
dev.em.1.interrupts.rx_desc_min_thresh: 0
dev.em.1.interrupts.rx_overrun: 2
hw.em.eee_setting: 0
hw.em.fc_setting: 0
hw.em.rx_process_limit: 100
hw.em.enable_msix: 1
hw.em.sbp: 0
hw.em.smart_pwr_down: 0
hw.em.txd: 1024
hw.em.rxd: 1024
hw.em.rx_abs_int_delay: 66
hw.em.tx_abs_int_delay: 66
hw.em.rx_int_delay: 0
hw.em.tx_int_delay: 66

Jason


More information about the freebsd-net mailing list