[Bug 195078] New: em tx_dma_fails and dropped packets

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Sun Nov 16 19:29:11 UTC 2014


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195078

            Bug ID: 195078
           Summary: em tx_dma_fails and dropped packets
           Product: Base System
           Version: 9.2-RELEASE
          Hardware: Any
                OS: Any
            Status: Needs Triage
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs at FreeBSD.org
          Reporter: fusionfoto at gmail.com

It looks like FreeBSD may be affected by the Intel 82574 erratum quoted below.
This likely affects every FreeBSD version whose em driver defaults to a higher
hw.em.rxd/hw.em.txd, which could be several.

I've turned TSO off on my running machine because I didn't want to reboot,
which solved one set of problems, and then had to increase
rx_processing_limit to hopefully solve the remaining packet drops.

I have another couple of machines scheduled to reboot with hw.em.rxd/hw.em.txd
set to 256, which I think is the old value, and hopefully I'll be able to set
the rest of the sysctls back to normal.

Hope this helps.


---



http://www.intel.com.au/content/dam/www/public/us/en/documents/specification-updates/82574-gbe-controller-spec-update.pdf



17. Tx Data Corruption When Using TCP Segmentation Offload

Problem: When using TSO, a situation can occur where a PCIe MRd request is
repeated with the same address, resulting in data corruption. At the end of
the TCP packet, the Tx DMA hangs because the length doesn't match. This can
only occur when the following are true:

• The first buffer of the packet is larger than [3 * (max_read_request - 4)].

• There is a 4 KB boundary within 64 bytes following the end of the header
  bytes in the buffer

Implication: Possible data corruption since a TCP packet is transmitted
containing the wrong data but with the correct checksum.
Data transmission halts as the Tx DMA module enters a hang state.

Workaround: The failure can be avoided by ensuring at least one of the
following:

• The buffer containing the headers should not be larger than
  [3 * (max_read_request - 4)]. To meet this requirement even for the minimum
  value of 128 bytes for max_read_request, the buffer should not be larger
  than 372 bytes.

• The alignment of the buffer containing the headers should be such that
  there is no 4 KB boundary within 64 bytes following the end of the header
  bytes. Assuming standard Ethernet/IP/TCP headers of 54 bytes, this means
  that the buffer should not start 54-118 bytes before a 4 KB boundary. For
  example, 128-byte alignment for this buffer could be used to fulfill this
  condition.

This problem has not been reported when using an Intel Linux* or Windows*
drivers. Current analysis shows it is very unlikely for a situation to exist
that would cause the 82574 to be at risk for the errata when using the Intel
Linux or Windows drivers.
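The two erratum conditions can be written down as a small check. This is only
a sketch of the arithmetic in the Intel text above, not driver code: the
function names are made up for illustration, and treating the "54-118 bytes"
range as inclusive at both ends is an assumption read off the erratum wording.

```python
PAGE_SIZE = 4096   # the "4 KB boundary" named in the erratum
HEADER_LEN = 54    # standard Ethernet/IP/TCP headers, per the erratum text
WINDOW = 64        # bytes following the end of the header bytes


def max_safe_header_buffer(max_read_request: int) -> int:
    """Largest first-buffer size that avoids condition 1."""
    return 3 * (max_read_request - 4)


def boundary_after_headers(buf_addr: int, header_len: int = HEADER_LEN) -> bool:
    """True if a 4 KB boundary lies within WINDOW bytes after the headers.

    Assumption: the range is inclusive, matching "54-118 bytes before a
    4 KB boundary" for 54-byte headers.
    """
    header_end = buf_addr + header_len
    # Distance from the end of the headers to the next 4 KB boundary
    # (0 when the headers end exactly on a boundary).
    distance = (-header_end) % PAGE_SIZE
    return distance <= WINDOW


def at_risk(buf_addr: int, buf_len: int, max_read_request: int,
            header_len: int = HEADER_LEN) -> bool:
    """Both erratum conditions must hold for the failure to be possible."""
    return (buf_len > max_safe_header_buffer(max_read_request)
            and boundary_after_headers(buf_addr, header_len))


if __name__ == "__main__":
    print(max_safe_header_buffer(128))          # 372, as stated in the erratum
    print(at_risk(0x1000 - 100, 1448, 128))     # True: starts 100 bytes before a boundary
    print(at_risk(0x1000 - 128, 1448, 128))     # False: 128-byte aligned start
```

Note that any 128-byte-aligned start address fails the second condition, which
is why the erratum suggests that alignment as one possible workaround.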

Linux and other operating systems seem to have fixed it. This could be getting
exercised because FreeBSD recently raised the default rxd/txd value above 256
for this driver.

**** my comments below ****

Since I didn't want to reboot to try the lower buffer size, I turned off TSO
on every machine I checked whose em interfaces were actively incrementing
tx_dma_fail, then re-enabled their membership in the LACP bundle.


In brief testing (a few gigabits for a few minutes), tx_dma_fail has not
incremented and throughput has not been negatively impacted (before vs. after
re-enabling).
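For anyone wanting to repeat the check, here is a rough sketch of comparing
two saved "sysctl dev.em.0" dumps to see whether a counter moved. The helper
names are hypothetical, and the "after" values in the example are invented
purely to show the output shape; only the "before" values come from the dump
quoted below.

```python
def parse_sysctl(text: str) -> dict:
    """Parse 'name: value' lines, keeping only integer-valued entries."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(": ")
        if not sep:
            continue
        try:
            counters[name] = int(value)
        except ValueError:
            # Non-numeric entries such as %desc or %driver are skipped.
            continue
    return counters


def counter_deltas(before: str, after: str, names) -> dict:
    """Return after - before for each named counter present in both dumps."""
    b, a = parse_sysctl(before), parse_sysctl(after)
    return {n: a[n] - b[n] for n in names if n in a and n in b}


if __name__ == "__main__":
    before = (
        "dev.em.0.%driver: em\n"
        "dev.em.0.tx_dma_fail: 1834648\n"
        "dev.em.0.rx_overruns: 3109\n"
    )
    # Hypothetical second snapshot, taken some minutes after disabling TSO.
    after = (
        "dev.em.0.%driver: em\n"
        "dev.em.0.tx_dma_fail: 1834648\n"
        "dev.em.0.rx_overruns: 3112\n"
    )
    watched = ["dev.em.0.tx_dma_fail", "dev.em.0.rx_overruns"]
    print(counter_deltas(before, after, watched))
```

A delta of zero for tx_dma_fail over a busy interval is what "has not
incremented" means above.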

On Thu, Nov 13, 2014 at 1:52 PM, FF <fusionfoto at gmail.com> wrote:


    What knob do I need to turn to address this?

    This em0 is in an LACP bundle with an igb0 that isn't showing this problem.

    dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.3.8
    dev.em.0.%driver: em
    dev.em.0.%location: slot=25 function=0 handle=\_SB_.PCI0.GLAN
    dev.em.0.%pnpinfo: vendor=0x8086 device=0x153b subvendor=0x15d9 subdevice=0x153b class=0x020000
    dev.em.0.%parent: pci0
    dev.em.0.nvm: -1
    dev.em.0.debug: -1
    dev.em.0.fc: 3
    dev.em.0.rx_int_delay: 0
    dev.em.0.tx_int_delay: 66
    dev.em.0.rx_abs_int_delay: 66
    dev.em.0.tx_abs_int_delay: 66
    dev.em.0.itr: 488
    dev.em.0.rx_processing_limit: 100
    dev.em.0.eee_control: 1
    dev.em.0.link_irq: 0
    dev.em.0.mbuf_alloc_fail: 52
    dev.em.0.cluster_alloc_fail: 0
    dev.em.0.dropped: 0
    **
    dev.em.0.tx_dma_fail: 1834648
    dev.em.0.rx_overruns: 3109
    **
    dev.em.0.watchdog_timeouts: 0
    dev.em.0.device_control: 1209532992
    dev.em.0.rx_control: 67141634
    dev.em.0.fc_high_water: 23584
    dev.em.0.fc_low_water: 20552
    dev.em.0.queue0.txd_head: 577
    dev.em.0.queue0.txd_tail: 577
    dev.em.0.queue0.tx_irq: 0
    dev.em.0.queue0.no_desc_avail: 0
    dev.em.0.queue0.rxd_head: 967
    dev.em.0.queue0.rxd_tail: 966
    dev.em.0.queue0.rx_irq: 0
    dev.em.0.mac_stats.excess_coll: 0
    dev.em.0.mac_stats.single_coll: 0
    dev.em.0.mac_stats.multiple_coll: 0
    dev.em.0.mac_stats.late_coll: 0
    dev.em.0.mac_stats.collision_count: 0
    dev.em.0.mac_stats.symbol_errors: 0
    dev.em.0.mac_stats.sequence_errors: 0
    dev.em.0.mac_stats.defer_count: 0
    dev.em.0.mac_stats.missed_packets: 61094
    dev.em.0.mac_stats.recv_no_buff: 60008
    dev.em.0.mac_stats.recv_undersize: 0
    dev.em.0.mac_stats.recv_fragmented: 0
    dev.em.0.mac_stats.recv_oversize: 0
    dev.em.0.mac_stats.recv_jabber: 0
    dev.em.0.mac_stats.recv_errs: 0
    dev.em.0.mac_stats.crc_errs: 0
    dev.em.0.mac_stats.alignment_errs: 0
    dev.em.0.mac_stats.coll_ext_errs: 0
    dev.em.0.mac_stats.xon_recvd: 40226659
    dev.em.0.mac_stats.xon_txd: 2132
    dev.em.0.mac_stats.xoff_recvd: 40241216
    dev.em.0.mac_stats.xoff_txd: 2073563
    dev.em.0.mac_stats.total_pkts_recvd: 3219537541
    dev.em.0.mac_stats.good_pkts_recvd: 3139008594
    dev.em.0.mac_stats.bcast_pkts_recvd: 3953817
    dev.em.0.mac_stats.mcast_pkts_recvd: 607157
    dev.em.0.mac_stats.rx_frames_64: 0
    dev.em.0.mac_stats.rx_frames_65_127: 0
    dev.em.0.mac_stats.rx_frames_128_255: 0
    dev.em.0.mac_stats.rx_frames_256_511: 0
    dev.em.0.mac_stats.rx_frames_512_1023: 0
    dev.em.0.mac_stats.rx_frames_1024_1522: 0
    dev.em.0.mac_stats.good_octets_recvd: 3527296369841
    dev.em.0.mac_stats.good_octets_txd: 14348531993101
    dev.em.0.mac_stats.total_pkts_txd: 10735190291
    dev.em.0.mac_stats.good_pkts_txd: 10733114595
    dev.em.0.mac_stats.bcast_pkts_txd: 14
    dev.em.0.mac_stats.mcast_pkts_txd: 54334
    dev.em.0.mac_stats.tx_frames_64: 0
    dev.em.0.mac_stats.tx_frames_65_127: 0
    dev.em.0.mac_stats.tx_frames_128_255: 0
    dev.em.0.mac_stats.tx_frames_256_511: 0
    dev.em.0.mac_stats.tx_frames_512_1023: 0
    dev.em.0.mac_stats.tx_frames_1024_1522: 0
    dev.em.0.mac_stats.tso_txd: 902605586
    dev.em.0.mac_stats.tso_ctx_fail: 0
    dev.em.0.interrupts.asserts: 1392541431
    dev.em.0.interrupts.rx_pkt_timer: 0
    dev.em.0.interrupts.rx_abs_timer: 0
    dev.em.0.interrupts.tx_pkt_timer: 0
    dev.em.0.interrupts.tx_abs_timer: 0
    dev.em.0.interrupts.tx_queue_empty: 0
    dev.em.0.interrupts.tx_queue_min_thresh: 0
    dev.em.0.interrupts.rx_desc_min_thresh: 0
    dev.em.0.interrupts.rx_overrun: 0
    dev.em.0.wake: 0

    dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 2.3.10
    dev.igb.0.%driver: igb
    dev.igb.0.%location: slot=0 function=0 handle=\_SB_.PCI0.RP04.PXSX
    dev.igb.0.%pnpinfo: vendor=0x8086 device=0x1533 subvendor=0x15d9 subdevice=0x1533 class=0x020000
    dev.igb.0.%parent: pci5
    dev.igb.0.nvm: -1
    dev.igb.0.enable_aim: 1
    dev.igb.0.fc: 3
    dev.igb.0.rx_processing_limit: 100
    dev.igb.0.dmac: 0
    dev.igb.0.eee_disabled: 0
    dev.igb.0.link_irq: 33
    dev.igb.0.dropped: 0
    dev.igb.0.tx_dma_fail: 0
    dev.igb.0.rx_overruns: 0
    dev.igb.0.watchdog_timeouts: 0
    dev.igb.0.device_control: 1209795137
    dev.igb.0.rx_control: 71335938
    dev.igb.0.interrupt_mask: 4
    dev.igb.0.extended_int_mask: 2147483679
    dev.igb.0.tx_buf_alloc: 0
    dev.igb.0.rx_buf_alloc: 0
    dev.igb.0.fc_high_water: 31328
    dev.igb.0.fc_low_water: 31312
    dev.igb.0.queue0.no_desc_avail: 0
    dev.igb.0.queue0.tx_packets: 62464141
    dev.igb.0.queue0.rx_packets: 73012939
    dev.igb.0.queue0.rx_bytes: 22529663814
    dev.igb.0.queue0.lro_queued: 0
    dev.igb.0.queue0.lro_flushed: 0
    dev.igb.0.queue1.no_desc_avail: 0
    dev.igb.0.queue1.tx_packets: 404298046
    dev.igb.0.queue1.rx_packets: 307675818
    dev.igb.0.queue1.rx_bytes: 185919902229
    dev.igb.0.queue1.lro_queued: 0
    dev.igb.0.queue1.lro_flushed: 0
    dev.igb.0.queue2.no_desc_avail: 0
    dev.igb.0.queue2.tx_packets: 3441053015
    dev.igb.0.queue2.rx_packets: 5511826751
    dev.igb.0.queue2.rx_bytes: 3054219311510
    dev.igb.0.queue2.lro_queued: 0
    dev.igb.0.queue2.lro_flushed: 0
    dev.igb.0.queue3.no_desc_avail: 0
    dev.igb.0.queue3.tx_packets: 1047838830
    dev.igb.0.queue3.rx_packets: 1987495318
    dev.igb.0.queue3.rx_bytes: 2696179247028
    dev.igb.0.queue3.lro_queued: 0
    dev.igb.0.queue3.lro_flushed: 0
    dev.igb.0.mac_stats.excess_coll: 0
    dev.igb.0.mac_stats.single_coll: 0
    dev.igb.0.mac_stats.multiple_coll: 0
    dev.igb.0.mac_stats.late_coll: 0
    dev.igb.0.mac_stats.collision_count: 0
    dev.igb.0.mac_stats.symbol_errors: 0
    dev.igb.0.mac_stats.sequence_errors: 0
    dev.igb.0.mac_stats.defer_count: 283811
    dev.igb.0.mac_stats.missed_packets: 9449
    dev.igb.0.mac_stats.recv_no_buff: 340
    dev.igb.0.mac_stats.recv_undersize: 0
    dev.igb.0.mac_stats.recv_fragmented: 0
    dev.igb.0.mac_stats.recv_oversize: 0
    dev.igb.0.mac_stats.recv_jabber: 0
    dev.igb.0.mac_stats.recv_errs: 0
    dev.igb.0.mac_stats.crc_errs: 0
    dev.igb.0.mac_stats.alignment_errs: 0
    dev.igb.0.mac_stats.coll_ext_errs: 0
    dev.igb.0.mac_stats.xon_recvd: 46255557
    dev.igb.0.mac_stats.xon_txd: 261
    dev.igb.0.mac_stats.xoff_recvd: 46255994
    dev.igb.0.mac_stats.xoff_txd: 7027
    dev.igb.0.mac_stats.total_pkts_recvd: 7975033582
    dev.igb.0.mac_stats.good_pkts_recvd: 7880001465
    dev.igb.0.mac_stats.bcast_pkts_recvd: 5783868
    dev.igb.0.mac_stats.mcast_pkts_recvd: 563315
    dev.igb.0.mac_stats.rx_frames_64: 28412906
    dev.igb.0.mac_stats.rx_frames_65_127: 3310187919
    dev.igb.0.mac_stats.rx_frames_128_255: 784920450
    dev.igb.0.mac_stats.rx_frames_256_511: 17225962
    dev.igb.0.mac_stats.rx_frames_512_1023: 73415350
    dev.igb.0.mac_stats.rx_frames_1024_1522: 3665838878
    dev.igb.0.mac_stats.good_octets_recvd: 5990356613544
    dev.igb.0.mac_stats.good_octets_txd: 46326753008181
    dev.igb.0.mac_stats.total_pkts_txd: 33016014138
    dev.igb.0.mac_stats.good_pkts_txd: 33016006850
    dev.igb.0.mac_stats.bcast_pkts_txd: 834
    dev.igb.0.mac_stats.mcast_pkts_txd: 54331
    dev.igb.0.mac_stats.tx_frames_64: 30741691
    dev.igb.0.mac_stats.tx_frames_65_127: 2174824217
    dev.igb.0.mac_stats.tx_frames_128_255: 139804927
    dev.igb.0.mac_stats.tx_frames_256_511: 59190261
    dev.igb.0.mac_stats.tx_frames_512_1023: 386886648
    dev.igb.0.mac_stats.tx_frames_1024_1522: 30224559106
    dev.igb.0.mac_stats.tso_txd: 2384636909
    dev.igb.0.mac_stats.tso_ctx_fail: 0
    dev.igb.0.interrupts.asserts: 4556119857
    dev.igb.0.interrupts.rx_pkt_timer: 7879778770
    dev.igb.0.interrupts.rx_abs_timer: 0
    dev.igb.0.interrupts.tx_pkt_timer: 0
    dev.igb.0.interrupts.tx_abs_timer: 0
    dev.igb.0.interrupts.tx_queue_empty: 33015268817
    dev.igb.0.interrupts.tx_queue_min_thresh: 7880001470
    dev.igb.0.interrupts.rx_desc_min_thresh: 0
    dev.igb.0.interrupts.rx_overrun: 0
    dev.igb.0.host.breaker_tx_pkt: 0
    dev.igb.0.host.host_tx_pkt_discard: 0
    dev.igb.0.host.rx_pkt: 222702
    dev.igb.0.host.breaker_rx_pkts: 0
    dev.igb.0.host.breaker_rx_pkt_drop: 0
    dev.igb.0.host.tx_good_pkt: 738033
    dev.igb.0.host.breaker_tx_pkt_drop: 0
    dev.igb.0.host.rx_good_bytes: 5990357073320
    dev.igb.0.host.tx_good_bytes: 46326753008181
    dev.igb.0.host.length_errors: 0
    dev.igb.0.host.serdes_violation_pkt: 0
    dev.igb.0.host.header_redir_missed: 0
    dev.igb.0.wake: 0


    hw.em.eee_setting: 1
    hw.em.rx_process_limit: 100
    hw.em.enable_msix: 1
    hw.em.sbp: 0
    hw.em.smart_pwr_down: 0
    hw.em.txd: 1024
    hw.em.rxd: 1024
    hw.em.rx_abs_int_delay: 66
    hw.em.tx_abs_int_delay: 66
    hw.em.rx_int_delay: 0
    hw.em.tx_int_delay: 66

    hw.igb.rx_process_limit: 100
    hw.igb.num_queues: 0
    hw.igb.header_split: 0
    hw.igb.buf_ring_size: 4096
    hw.igb.max_interrupt_rate: 8000
    hw.igb.enable_msix: 1
    hw.igb.enable_aim: 1
    hw.igb.txd: 1024
    hw.igb.rxd: 1024

    FreeBSD systemname.com 9.2-RELEASE-p10 FreeBSD 9.2-RELEASE-p10 #0 r270148M: Mon Aug 18 23:14:36 EDT 2014 root at peta108:/usr/obj/usr/src/sys/CUSTOM10 amd64

    em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=4019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
            ether 00:25:90:f2:2d:24
            inet6 fe80::225:90ff:fef2:2d24%em0 prefixlen 64 scopeid 0x2
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
    igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
            ether 00:25:90:f2:2d:24
            inet6 fe80::225:90ff:fef2:2d25%igb0 prefixlen 64 scopeid 0x4
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
    lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
            options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
            inet6 ::1 prefixlen 128
            inet6 fe80::1%lo0 prefixlen 64 scopeid 0x7
            inet 127.0.0.1 netmask 0xff000000
            nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=4019b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
            ether 00:25:90:f2:2d:24
            inet 192.168.0.108 netmask 0xffffff00 broadcast 192.168.0.255
            inet6 fe80::225:90ff:fef2:2d24%lagg0 prefixlen 64 scopeid 0x8
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
            media: Ethernet autoselect
            status: active
            laggproto lacp lagghash l2,l3,l4
            laggport: igb0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
            laggport: em0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>

    Thanks in advance!
