Re: igc problems with heavy traffic (update)

From: John Fieber <jrf_at_ursamaris.org>
Date: Sun, 25 Sep 2022 00:30:16 UTC
> On Sep 14, 2022, at 8:03 AM, mike tancsa <mike@sentex.net> wrote:
> 
> OK, an update hence the top post. I got a new pair of boxes which use a different Jasper Lake chipset and have i226-V vs the i225 of the previous box.
> 
> dev.igc.0.%parent: pci2
> dev.igc.0.%pnpinfo: vendor=0x8086 device=0x125c subvendor=0x8086 subdevice=0x0000 class=0x020000
> dev.igc.0.%location: slot=0 function=0 dbsf=pci0:2:0:0 handle=\_SB_.PC00.RP05.PXSX
> dev.igc.0.%driver: igc
> dev.igc.0.%desc: Intel(R) Ethernet Controller I226-V
> dev.igc.%parent:
> 
> With a default RELENG_13, out of the box with no tweaks, I am NOT able to cause the transmitting NIC to bounce with heavy traffic. I used the same test script (a constant stream of iperf3 alternating in direction) maxing out the NIC's bandwidth, and all seems fine after running the test for some 18hrs. Maybe there is something different about the i225 version of this NIC that needs different driver defaults?
> 
>     ---Mike
> 

I also see this behavior with 13.1-RELEASE-p2 on:

CPU: Intel(R) Celeron(R) J4125 CPU @ 2.00GHz (1996.80-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x706a8  Family=0x6  Model=0x7a  Stepping=8

NIC (x4):

dev.igc.0.%parent: pci1
dev.igc.0.%pnpinfo: vendor=0x8086 device=0x15f3 subvendor=0x8086 subdevice=0x0000 class=0x020000
dev.igc.0.%location: slot=0 function=0 dbsf=pci0:1:0:0 handle=\_SB_.PCI0.RP03.PXSX
dev.igc.0.%driver: igc
dev.igc.0.%desc: Intel(R) Ethernet Controller I225-V

Twiddling EEE doesn’t seem to affect it; disabling flow control helps a bit, but not by a meaningful amount.

Tests were done through a TP-Link TL-SG3210XHP-M2 switch, with the other party being 13.1-RELEASE-p2 on a 10 Gb DAC connection (ixl driver).

For comparison, I loaded a variety of systems in bhyve (with PCI passthrough of a NIC). These all showed the same problem as the host, with the interface bouncing multiple times within a 5-minute iperf3 test:

- FreeBSD-13.1-STABLE-amd64-20220923
- OPNsense-22.7
- pfSense-CE 2.7-DEVELOPMENT-latest

These, however, offer unflappable performance:

- FreeBSD-14.0-CURRENT-amd64-20220923
- vyos-1.4 (for reference, what I mostly use on this hardware, via bhyve)
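
For anyone who wants to reproduce this, here is a minimal sketch of the kind of test loop described in this thread: iperf3 runs that alternate direction, with the link_irq sysctl read before and after each run. It is not the script actually used for the results above. The peer address, round count, and helper names are hypothetical, and it assumes iperf3 is installed and that link_irq increases when the link changes state (as it appears to in the quoted before/after output).

```python
import subprocess


def iperf3_cmd(server, seconds, reverse):
    """Build an iperf3 client command; -R asks the server to send instead."""
    cmd = ["iperf3", "-c", server, "-t", str(seconds)]
    if reverse:
        cmd.append("-R")
    return cmd


def link_irq(unit):
    """Read dev.igc.<unit>.link_irq with 'sysctl -n' (FreeBSD only)."""
    out = subprocess.run(
        ["sysctl", "-n", f"dev.igc.{unit}.link_irq"],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip())


def run_test(server, unit=0, rounds=10, seconds=300):
    """Alternate traffic direction each round and report link_irq changes."""
    reverse = False
    for i in range(rounds):
        before = link_irq(unit)
        # check=False: iperf3 may exit non-zero if the link drops mid-run.
        subprocess.run(iperf3_cmd(server, seconds, reverse), check=False)
        delta = link_irq(unit) - before
        print(f"round {i} reverse={reverse} link_irq delta={delta}")
        reverse = not reverse
```

Calling run_test("192.0.2.10") on the igc host (with "iperf3 -s" running on the peer at that placeholder address) prints one line per round; a non-zero delta suggests the link changed state during that run.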

-john


> 
> On 8/12/2022 11:04 AM, mike tancsa wrote:
>> 
>> On 8/10/2022 3:53 PM, mike tancsa wrote:
>>> On 8/10/2022 1:47 PM, Pieper, Jeffrey E wrote:
>>>> 
>>>> You could try disabling EEE (Energy Efficient Ethernet). Something like: sysctl dev.igc.0.eee_control=0.
>>> 
>>> 
>>> It does not seem to make a difference. If I leave flow control at the default, I get the link bounce on the 2.5G xover (cat 6 cable) maybe 2-3 min into running iperf3 tests. However, if I disable all flow control:
>>> 
>>> dev.igc.0.fc=0
>>> dev.igc.1.fc=0
>>> dev.igc.2.fc=0
>>> dev.igc.3.fc=0
>>> 
>>> It *seems* to be less frequent but still happens. I ordered a 2.5G switch so I can at least try to see which side is dropping the link. I should have it Friday to continue testing.
>>> 
>> 
>> OK, I repeated the tests with a 2.5G unmanaged switch in between the two units rather than xover. It looks like it's the server that is sending the majority of the packets that drops the link, not the receiver.
>> 
>> One other test I did was to raise hw.igc.max_interrupt_rate to 13000 from the default of 8000. That seems to make the problem MUCH more acute.
>> 
>> Here is the before and after of the link drop.
>> 
>>  dev.igc.1.wake: 0
>>  dev.igc.1.interrupts.rx_desc_min_thresh: 0
>> -dev.igc.1.interrupts.asserts: 65
>> +dev.igc.1.interrupts.asserts: 4879479
>>  dev.igc.1.mac_stats.tso_txd: 0
>> -dev.igc.1.mac_stats.tx_frames_1024_1522: 3
>> -dev.igc.1.mac_stats.tx_frames_512_1023: 1
>> -dev.igc.1.mac_stats.tx_frames_256_511: 2
>> -dev.igc.1.mac_stats.tx_frames_128_255: 15
>> -dev.igc.1.mac_stats.tx_frames_65_127: 2
>> +dev.igc.1.mac_stats.tx_frames_1024_1522: 12973065
>> +dev.igc.1.mac_stats.tx_frames_512_1023: 58
>> +dev.igc.1.mac_stats.tx_frames_256_511: 107
>> +dev.igc.1.mac_stats.tx_frames_128_255: 1215725
>> +dev.igc.1.mac_stats.tx_frames_65_127: 192
>>  dev.igc.1.mac_stats.tx_frames_64: 1
>>  dev.igc.1.mac_stats.mcast_pkts_txd: 0
>>  dev.igc.1.mac_stats.bcast_pkts_txd: 1
>> -dev.igc.1.mac_stats.good_pkts_txd: 24
>> -dev.igc.1.mac_stats.total_pkts_txd: 24
>> -dev.igc.1.mac_stats.good_octets_txd: 7674
>> -dev.igc.1.mac_stats.good_octets_recvd: 6492
>> -dev.igc.1.mac_stats.rx_frames_1024_1522: 2
>> -dev.igc.1.mac_stats.rx_frames_512_1023: 1
>> -dev.igc.1.mac_stats.rx_frames_256_511: 2
>> -dev.igc.1.mac_stats.rx_frames_128_255: 15
>> -dev.igc.1.mac_stats.rx_frames_65_127: 2
>> +dev.igc.1.mac_stats.good_pkts_txd: 14189148
>> +dev.igc.1.mac_stats.total_pkts_txd: 14189148
>> +dev.igc.1.mac_stats.good_octets_txd: 19450753554
>> +dev.igc.1.mac_stats.good_octets_recvd: 14933399426
>> +dev.igc.1.mac_stats.rx_frames_1024_1522: 9823228
>> +dev.igc.1.mac_stats.rx_frames_512_1023: 3
>> +dev.igc.1.mac_stats.rx_frames_256_511: 62
>> +dev.igc.1.mac_stats.rx_frames_128_255: 2365665
>> +dev.igc.1.mac_stats.rx_frames_65_127: 213
>>  dev.igc.1.mac_stats.rx_frames_64: 1
>>  dev.igc.1.mac_stats.mcast_pkts_recvd: 0
>>  dev.igc.1.mac_stats.bcast_pkts_recvd: 0
>> -dev.igc.1.mac_stats.good_pkts_recvd: 23
>> -dev.igc.1.mac_stats.total_pkts_recvd: 23
>> +dev.igc.1.mac_stats.good_pkts_recvd: 12189172
>> +dev.igc.1.mac_stats.total_pkts_recvd: 12189172
>>  dev.igc.1.mac_stats.xoff_txd: 0
>>  dev.igc.1.mac_stats.xoff_recvd: 0
>>  dev.igc.1.mac_stats.xon_txd: 0
>>  dev.igc.1.mac_stats.single_coll: 0
>>  dev.igc.1.mac_stats.excess_coll: 0
>>  dev.igc.1.queue_rx_3.rx_irq: 0
>> -dev.igc.1.queue_rx_3.rxd_tail: 21
>> -dev.igc.1.queue_rx_3.rxd_head: 22
>> +dev.igc.1.queue_rx_3.rxd_tail: 498
>> +dev.igc.1.queue_rx_3.rxd_head: 499
>>  dev.igc.1.queue_rx_2.rx_irq: 0
>>  dev.igc.1.queue_rx_2.rxd_tail: 128
>>  dev.igc.1.queue_rx_2.rxd_head: 0
>>  dev.igc.1.queue_rx_0.rxd_tail: 0
>>  dev.igc.1.queue_rx_0.rxd_head: 1
>>  dev.igc.1.queue_tx_3.tx_irq: 0
>> -dev.igc.1.queue_tx_3.txd_tail: 0
>> -dev.igc.1.queue_tx_3.txd_head: 0
>> +dev.igc.1.queue_tx_3.txd_tail: 746
>> +dev.igc.1.queue_tx_3.txd_head: 746
>>  dev.igc.1.queue_tx_2.tx_irq: 0
>> -dev.igc.1.queue_tx_2.txd_tail: 0
>> -dev.igc.1.queue_tx_2.txd_head: 0
>> +dev.igc.1.queue_tx_2.txd_tail: 186
>> +dev.igc.1.queue_tx_2.txd_head: 186
>>  dev.igc.1.queue_tx_1.tx_irq: 0
>> -dev.igc.1.queue_tx_1.txd_tail: 0
>> -dev.igc.1.queue_tx_1.txd_head: 0
>> +dev.igc.1.queue_tx_1.txd_tail: 520
>> +dev.igc.1.queue_tx_1.txd_head: 520
>>  dev.igc.1.queue_tx_0.tx_irq: 0
>> -dev.igc.1.queue_tx_0.txd_tail: 45
>> -dev.igc.1.queue_tx_0.txd_head: 45
>> +dev.igc.1.queue_tx_0.txd_tail: 777
>> +dev.igc.1.queue_tx_0.txd_head: 777
>>  dev.igc.1.fc_low_water: 32752
>>  dev.igc.1.fc_high_water: 32768
>>  dev.igc.1.rx_control: 71335938
>>  dev.igc.1.device_control: 404489793
>>  dev.igc.1.watchdog_timeouts: 0
>>  dev.igc.1.rx_overruns: 0
>> -dev.igc.1.link_irq: 2
>> +dev.igc.1.link_irq: 4
>>  dev.igc.1.dropped: 0
>>  dev.igc.1.eee_control: 0
>>  dev.igc.1.itr: 488
>>  dev.igc.1.nvm: -1
>>  dev.igc.1.iflib.rxq3.rxq_fl0.buf_size: 2048
>>  dev.igc.1.iflib.rxq3.rxq_fl0.credits: 1023
>> -dev.igc.1.iflib.rxq3.rxq_fl0.cidx: 22
>> -dev.igc.1.iflib.rxq3.rxq_fl0.pidx: 21
>> +dev.igc.1.iflib.rxq3.rxq_fl0.cidx: 499
>> +dev.igc.1.iflib.rxq3.rxq_fl0.pidx: 498
>>  dev.igc.1.iflib.rxq3.cpu: 3
>>  dev.igc.1.iflib.rxq2.rxq_fl0.buf_size: 2048
>>  dev.igc.1.iflib.rxq2.rxq_fl0.credits: 128
>>  dev.igc.1.iflib.txq3.r_abdications: 0
>>  dev.igc.1.iflib.txq3.r_restarts: 0
>>  dev.igc.1.iflib.txq3.r_stalls: 0
>> -dev.igc.1.iflib.txq3.r_starts: 0
>> +dev.igc.1.iflib.txq3.r_starts: 6175093
>>  dev.igc.1.iflib.txq3.r_drops: 0
>> -dev.igc.1.iflib.txq3.r_enqueues: 0
>> -dev.igc.1.iflib.txq3.ring_state: pidx_head: 0000 pidx_tail: 0000 cidx: 0000 state: IDLE
>> -dev.igc.1.iflib.txq3.txq_cleaned: 0
>> -dev.igc.1.iflib.txq3.txq_processed: 0
>> -dev.igc.1.iflib.txq3.txq_in_use: 0
>> -dev.igc.1.iflib.txq3.txq_cidx_processed: 0
>> -dev.igc.1.iflib.txq3.txq_cidx: 0
>> -dev.igc.1.iflib.txq3.txq_pidx: 0
>> +dev.igc.1.iflib.txq3.r_enqueues: 6175093
>> +dev.igc.1.iflib.txq3.ring_state: pidx_head: 0373 pidx_tail: 0373 cidx: 0373 state: IDLE
>> +dev.igc.1.iflib.txq3.txq_cleaned: 12350144
>> +dev.igc.1.iflib.txq3.txq_processed: 12350184
>> +dev.igc.1.iflib.txq3.txq_in_use: 42
>> +dev.igc.1.iflib.txq3.txq_cidx_processed: 744
>> +dev.igc.1.iflib.txq3.txq_cidx: 704
>> +dev.igc.1.iflib.txq3.txq_pidx: 746
>>  dev.igc.1.iflib.txq3.no_tx_dma_setup: 0
>>  dev.igc.1.iflib.txq3.txd_encap_efbig: 0
>>  dev.igc.1.iflib.txq3.tx_map_failed: 0
>>  dev.igc.1.iflib.txq2.r_abdications: 0
>>  dev.igc.1.iflib.txq2.r_restarts: 0
>>  dev.igc.1.iflib.txq2.r_stalls: 0
>> -dev.igc.1.iflib.txq2.r_starts: 0
>> +dev.igc.1.iflib.txq2.r_starts: 3421789
>>  dev.igc.1.iflib.txq2.r_drops: 0
>> -dev.igc.1.iflib.txq2.r_enqueues: 0
>> -dev.igc.1.iflib.txq2.ring_state: pidx_head: 0000 pidx_tail: 0000 cidx: 0000 state: IDLE
>> -dev.igc.1.iflib.txq2.txq_cleaned: 0
>> -dev.igc.1.iflib.txq2.txq_processed: 0
>> -dev.igc.1.iflib.txq2.txq_in_use: 0
>> -dev.igc.1.iflib.txq2.txq_cidx_processed: 0
>> -dev.igc.1.iflib.txq2.txq_cidx: 0
>> -dev.igc.1.iflib.txq2.txq_pidx: 0
>> +dev.igc.1.iflib.txq2.r_enqueues: 3421789
>> +dev.igc.1.iflib.txq2.ring_state: pidx_head: 1629 pidx_tail: 1629 cidx: 1629 state: IDLE
>> +dev.igc.1.iflib.txq2.txq_cleaned: 6843536
>> +dev.igc.1.iflib.txq2.txq_processed: 6843576
>> +dev.igc.1.iflib.txq2.txq_in_use: 42
>> +dev.igc.1.iflib.txq2.txq_cidx_processed: 184
>> +dev.igc.1.iflib.txq2.txq_cidx: 144
>> +dev.igc.1.iflib.txq2.txq_pidx: 186
>>  dev.igc.1.iflib.txq2.no_tx_dma_setup: 0
>>  dev.igc.1.iflib.txq2.txd_encap_efbig: 0
>>  dev.igc.1.iflib.txq2.tx_map_failed: 0
>>  dev.igc.1.iflib.txq1.r_abdications: 0
>>  dev.igc.1.iflib.txq1.r_restarts: 0
>>  dev.igc.1.iflib.txq1.r_stalls: 0
>> -dev.igc.1.iflib.txq1.r_starts: 0
>> +dev.igc.1.iflib.txq1.r_starts: 2734852
>>  dev.igc.1.iflib.txq1.r_drops: 0
>> -dev.igc.1.iflib.txq1.r_enqueues: 0
>> -dev.igc.1.iflib.txq1.ring_state: pidx_head: 0000 pidx_tail: 0000 cidx: 0000 state: IDLE
>> -dev.igc.1.iflib.txq1.txq_cleaned: 0
>> -dev.igc.1.iflib.txq1.txq_processed: 0
>> -dev.igc.1.iflib.txq1.txq_in_use: 0
>> -dev.igc.1.iflib.txq1.txq_cidx_processed: 0
>> -dev.igc.1.iflib.txq1.txq_cidx: 0
>> -dev.igc.1.iflib.txq1.txq_pidx: 0
>> +dev.igc.1.iflib.txq1.r_enqueues: 2734852
>> +dev.igc.1.iflib.txq1.ring_state: pidx_head: 0772 pidx_tail: 0772 cidx: 0772 state: IDLE
>> +dev.igc.1.iflib.txq1.txq_cleaned: 5469662
>> +dev.igc.1.iflib.txq1.txq_processed: 5469702
>> +dev.igc.1.iflib.txq1.txq_in_use: 42
>> +dev.igc.1.iflib.txq1.txq_cidx_processed: 518
>> +dev.igc.1.iflib.txq1.txq_cidx: 478
>> +dev.igc.1.iflib.txq1.txq_pidx: 520
>>  dev.igc.1.iflib.txq1.no_tx_dma_setup: 0
>>  dev.igc.1.iflib.txq1.txd_encap_efbig: 0
>>  dev.igc.1.iflib.txq1.tx_map_failed: 0
>>  dev.igc.1.iflib.txq0.r_abdications: 0
>>  dev.igc.1.iflib.txq0.r_restarts: 0
>>  dev.igc.1.iflib.txq0.r_stalls: 0
>> -dev.igc.1.iflib.txq0.r_starts: 24
>> +dev.igc.1.iflib.txq0.r_starts: 1857414
>>  dev.igc.1.iflib.txq0.r_drops: 0
>> -dev.igc.1.iflib.txq0.r_enqueues: 24
>> -dev.igc.1.iflib.txq0.ring_state: pidx_head: 0024 pidx_tail: 0024 cidx: 0024 state: IDLE
>> -dev.igc.1.iflib.txq0.txq_cleaned: 3
>> -dev.igc.1.iflib.txq0.txq_processed: 43
>> +dev.igc.1.iflib.txq0.r_enqueues: 1857414
>> +dev.igc.1.iflib.txq0.ring_state: pidx_head: 1926 pidx_tail: 1926 cidx: 1926 state: IDLE
>> +dev.igc.1.iflib.txq0.txq_cleaned: 3714783
>> +dev.igc.1.iflib.txq0.txq_processed: 3714823
>>  dev.igc.1.iflib.txq0.txq_in_use: 42
>> -dev.igc.1.iflib.txq0.txq_cidx_processed: 43
>> -dev.igc.1.iflib.txq0.txq_cidx: 3
>> -dev.igc.1.iflib.txq0.txq_pidx: 45
>> +dev.igc.1.iflib.txq0.txq_cidx_processed: 775
>> +dev.igc.1.iflib.txq0.txq_cidx: 735
>> +dev.igc.1.iflib.txq0.txq_pidx: 777
>>  dev.igc.1.iflib.txq0.no_tx_dma_setup: 0
>>  dev.igc.1.iflib.txq0.txd_encap_efbig: 0
>>  dev.igc.1.iflib.txq0.tx_map_failed: 0
>>  dev.igc.1.%desc: Intel(R) Ethernet Controller I225-V
>> 
>> Interface is RUNNING and ACTIVE
>> igc1: TX Queue 0 ------
>> igc1: hw tdh = 777, hw tdt = 777
>> igc1: TX Queue 1 ------
>> igc1: hw tdh = 520, hw tdt = 520
>> igc1: TX Queue 2 ------
>> igc1: hw tdh = 186, hw tdt = 186
>> igc1: TX Queue 3 ------
>> igc1: hw tdh = 746, hw tdt = 746
>> igc1: RX Queue 0 ------
>> igc1: hw rdh = 1, hw rdt = 0
>> igc1: RX Queue 1 ------
>> igc1: hw rdh = 0, hw rdt = 128
>> igc1: RX Queue 2 ------
>> igc1: hw rdh = 0, hw rdt = 128
>> igc1: RX Queue 3 ------
>> igc1: hw rdh = 499, hw rdt = 498
>> 
>> 
>> 
>
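
A before/after comparison like the one quoted above can also be produced with a short script instead of diff. This is only a sketch, assuming two text snapshots of sysctl output in the usual "name: value" form; the function names are made up for illustration.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines, as printed by sysctl, into a dict."""
    values = {}
    for line in text.splitlines():
        # Split on the first ': ' only, so values that themselves contain
        # colons (such as the ring_state lines) stay intact.
        name, sep, value = line.partition(": ")
        if sep:
            values[name.strip()] = value.strip()
    return values


def changed(before, after):
    """Return {name: (old, new)} for each OID whose value differs."""
    old, new = parse_sysctl(before), parse_sysctl(after)
    return {k: (old.get(k), v) for k, v in new.items() if old.get(k) != v}
```

The snapshots could be taken with "sysctl dev.igc.1 > before.txt" before the test and again into after.txt once the link has dropped, then read and passed to changed().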