High CPU interrupt load on intel I350T4 with igb on 8.3

Barney Cordoba barney_cordoba at yahoo.com
Thu May 9 12:56:56 UTC 2013



--- On Sun, 4/28/13, Barney Cordoba <barney_cordoba at yahoo.com> wrote:

> From: Barney Cordoba <barney_cordoba at yahoo.com>
> Subject: Re: High CPU interrupt load on intel I350T4 with igb on 8.3
> To: "Jack Vogel" <jfvogel at gmail.com>
> Cc: "FreeBSD Net" <freebsd-net at freebsd.org>, "Clément Hermann (nodens)" <nodens2099 at gmail.com>
> Date: Sunday, April 28, 2013, 2:59 PM
> The point of lists is to be able to
> benefit from others' experiences so you don't have to waste
> your time "trying" things that others have already done.
> I'm not pontificating. I've done the tests. There's no
> reason for every person who is having the exact same problem
> to do the same tests over and over, hoping for some magically
> different result. The result will always be the same,
> because there's no chance of it working properly by
> chance.
> BC
> 
> 
> --- On Sun, 4/28/13, Jack Vogel <jfvogel at gmail.com>
> wrote:
> 
> From: Jack Vogel <jfvogel at gmail.com>
> Subject: Re: High CPU interrupt load on intel I350T4 with
> igb on 8.3
> To: "Barney Cordoba" <barney_cordoba at yahoo.com>
> Cc: "FreeBSD Net" <freebsd-net at freebsd.org>,
> "Clément Hermann (nodens)" <nodens2099 at gmail.com>
> Date: Sunday, April 28, 2013, 1:07 PM
> 
> Try setting your queues to 1, run some tests, then try
> setting your queues to 2, then to 4... it's called tuning, and
> rather than just pontificating about it, which Barney so
> loves to do, you can
> discover what works best. I ran tests last week preparing
> for a new driver version and found the best results came not
> only while tweaking queues, but also ring size, and I could
> see changes based
> on the buf ring size.... There are lots of things that may
> improve or degrade performance depending on the workload.
> Jack
> 
> 
> 
> On Sun, Apr 28, 2013 at 7:21 AM, Barney Cordoba <barney_cordoba at yahoo.com>
> wrote:
> 
> 
> 
> 
> 
> --- On Fri, 4/26/13, "Clément Hermann (nodens)" <nodens2099 at gmail.com>
> wrote:
> 
> 
> 
> > From: "Clément Hermann (nodens)" <nodens2099 at gmail.com>
> > Subject: High CPU interrupt load on intel I350T4 with igb on 8.3
> > To: freebsd-net at freebsd.org
> > Date: Friday, April 26, 2013, 7:31 AM
> 
> > Hi list,
> >
> > We use pf+ALTQ for traffic shaping on some routers.
> >
> > We are switching to new servers: Dell PowerEdge R620 with 2
> > 8-core Intel processors (E5-2650L), 8GB RAM and Intel I350T4
> > (quad port) using the igb driver. The old hardware is using the em
> > driver; the CPU load is high but mostly due to kernel and a
> > large pf ruleset.
> >
> > On the new hardware, we see high CPU interrupt load (up to
> > 95%), even though there is not much traffic currently (peaks
> > about 150Mbps and 40Kpps). All queues are used and bound to
> > a CPU according to top, but a lot of CPU time is spent on
> > igb queues (interrupt or wait). The load is fine when we
> > stay below 20Kpps.
> >
> > We see no mbuf shortage, no dropped packets, but there is
> > little margin left on CPU time (about 25% idle at best, most
> > of the CPU time is spent on interrupts), which is disturbing.
> >
> > We have done some tuning, but to no avail:
> 
> >
> 
> > sysctl.conf:
> >
> > # mbufs
> > kern.ipc.nmbclusters=65536
> > # Sockets
> > kern.ipc.somaxconn=8192
> > net.inet.tcp.delayed_ack=0
> > net.inet.tcp.sendspace=65535
> > net.inet.udp.recvspace=65535
> > net.inet.udp.maxdgram=57344
> > net.local.stream.recvspace=65535
> > net.local.stream.sendspace=65535
> > # IGB
> > dev.igb.0.rx_processing_limit=4096
> > dev.igb.1.rx_processing_limit=4096
> > dev.igb.2.rx_processing_limit=4096
> > dev.igb.3.rx_processing_limit=4096
> 
> >
> 
> > /boot/loader.conf:
> >
> > vm.kmem_size=1G
> > hw.igb.max_interrupt_rate="32000"  # maximum number of interrupts/sec generated by single igb(4) (default 8000)
> > hw.igb.txd="2048"  # number of transmit descriptors allocated by the driver (2048 limit)
> > hw.igb.rxd="2048"  # number of receive descriptors allocated by the driver (2048 limit)
> > hw.igb.rx_process_limit="1000"  # maximum number of received packets to process at a time. The default of 100 is too low for most firewalls. (-1 means unlimited)
> 
> >
> 
> > Kernel HZ is 1000.
> >
> > The IGB /boot/loader.conf tuning was our last attempt; it
> > didn't change anything.
> >
> > Does anyone have any pointers? How could we lower the CPU
> > interrupt load? Should we set hw.igb.max_interrupt_rate
> > lower instead of higher?
> > From what we saw here and there, we should be able to do
> > much better with this hardware.
> >
> >
> > Relevant sysctl output (igb1 and igb2 only, other interfaces are
> > unused):
> 
> >
> 
> > sysctl dev.igb | grep -v ": 0$"
> 
> > dev.igb.1.%desc: Intel(R) PRO/1000 Network Connection
> 
> > version - 2.3.1
> 
> > dev.igb.1.%driver: igb
> 
> > dev.igb.1.%location: slot=0 function=1
> 
> > dev.igb.1.%pnpinfo: vendor=0x8086 device=0x1521
> 
> > subvendor=0x8086 subdevice=0x5001 class=0x020000
> 
> > dev.igb.1.%parent: pci5
> 
> > dev.igb.1.nvm: -1
> 
> > dev.igb.1.enable_aim: 1
> 
> > dev.igb.1.fc: 3
> 
> > dev.igb.1.rx_processing_limit: 4096
> 
> > dev.igb.1.eee_disabled: 1
> 
> > dev.igb.1.link_irq: 2
> 
> > dev.igb.1.device_control: 1209795137
> 
> > dev.igb.1.rx_control: 67141658
> 
> > dev.igb.1.interrupt_mask: 4
> 
> > dev.igb.1.extended_int_mask: 2147483981
> 
> > dev.igb.1.fc_high_water: 33168
> 
> > dev.igb.1.fc_low_water: 33152
> 
> > dev.igb.1.queue0.interrupt_rate: 71428
> 
> > dev.igb.1.queue0.txd_head: 1318
> 
> > dev.igb.1.queue0.txd_tail: 1318
> 
> > dev.igb.1.queue0.tx_packets: 84663594
> 
> > dev.igb.1.queue0.rxd_head: 717
> 
> > dev.igb.1.queue0.rxd_tail: 715
> 
> > dev.igb.1.queue0.rx_packets: 43899597
> 
> > dev.igb.1.queue0.rx_bytes: 8905556030
> 
> > dev.igb.1.queue1.interrupt_rate: 90909
> 
> > dev.igb.1.queue1.txd_head: 693
> 
> > dev.igb.1.queue1.txd_tail: 693
> 
> > dev.igb.1.queue1.tx_packets: 57543349
> 
> > dev.igb.1.queue1.rxd_head: 1033
> 
> > dev.igb.1.queue1.rxd_tail: 1032
> 
> > dev.igb.1.queue1.rx_packets: 54821897
> 
> > dev.igb.1.queue1.rx_bytes: 9944955108
> 
> > dev.igb.1.queue2.interrupt_rate: 100000
> 
> > dev.igb.1.queue2.txd_head: 350
> 
> > dev.igb.1.queue2.txd_tail: 350
> 
> > dev.igb.1.queue2.tx_packets: 62320990
> 
> > dev.igb.1.queue2.rxd_head: 1962
> 
> > dev.igb.1.queue2.rxd_tail: 1939
> 
> > dev.igb.1.queue2.rx_packets: 43909016
> 
> > dev.igb.1.queue2.rx_bytes: 8673941461
> 
> > dev.igb.1.queue3.interrupt_rate: 14925
> 
> > dev.igb.1.queue3.txd_head: 647
> 
> > dev.igb.1.queue3.txd_tail: 647
> 
> > dev.igb.1.queue3.tx_packets: 58776199
> 
> > dev.igb.1.queue3.rxd_head: 692
> 
> > dev.igb.1.queue3.rxd_tail: 691
> 
> > dev.igb.1.queue3.rx_packets: 55138996
> 
> > dev.igb.1.queue3.rx_bytes: 9310217354
> 
> > dev.igb.1.queue4.interrupt_rate: 100000
> 
> > dev.igb.1.queue4.txd_head: 1721
> 
> > dev.igb.1.queue4.txd_tail: 1721
> 
> > dev.igb.1.queue4.tx_packets: 54337209
> 
> > dev.igb.1.queue4.rxd_head: 1609
> 
> > dev.igb.1.queue4.rxd_tail: 1598
> 
> > dev.igb.1.queue4.rx_packets: 46546503
> 
> > dev.igb.1.queue4.rx_bytes: 8818182840
> 
> > dev.igb.1.queue5.interrupt_rate: 11627
> 
> > dev.igb.1.queue5.txd_head: 254
> 
> > dev.igb.1.queue5.txd_tail: 254
> 
> > dev.igb.1.queue5.tx_packets: 53117182
> 
> > dev.igb.1.queue5.rxd_head: 701
> 
> > dev.igb.1.queue5.rxd_tail: 685
> 
> > dev.igb.1.queue5.rx_packets: 43014837
> 
> > dev.igb.1.queue5.rx_bytes: 8699057447
> 
> > dev.igb.1.queue6.interrupt_rate: 55555
> 
> > dev.igb.1.queue6.txd_head: 8
> 
> > dev.igb.1.queue6.txd_tail: 8
> 
> > dev.igb.1.queue6.tx_packets: 52654088
> 
> > dev.igb.1.queue6.rxd_head: 1057
> 
> > dev.igb.1.queue6.rxd_tail: 1041
> 
> > dev.igb.1.queue6.rx_packets: 45227030
> 
> > dev.igb.1.queue6.rx_bytes: 9494489640
> 
> > dev.igb.1.queue7.interrupt_rate: 5235
> 
> > dev.igb.1.queue7.txd_head: 729
> 
> > dev.igb.1.queue7.txd_tail: 729
> 
> > dev.igb.1.queue7.tx_packets: 61926105
> 
> > dev.igb.1.queue7.rxd_head: 146
> 
> > dev.igb.1.queue7.rxd_tail: 140
> 
> > dev.igb.1.queue7.rx_packets: 51781775
> 
> > dev.igb.1.queue7.rx_bytes: 8901279226
> 
> > dev.igb.1.mac_stats.missed_packets: 1657
> 
> > dev.igb.1.mac_stats.recv_no_buff: 405
> 
> > dev.igb.1.mac_stats.total_pkts_recvd: 384332760
> 
> > dev.igb.1.mac_stats.good_pkts_recvd: 384331103
> 
> > dev.igb.1.mac_stats.bcast_pkts_recvd: 15510
> 
> > dev.igb.1.mac_stats.mcast_pkts_recvd: 52957
> 
> > dev.igb.1.mac_stats.rx_frames_64: 195496498
> 
> > dev.igb.1.mac_stats.rx_frames_65_127: 133346124
> 
> > dev.igb.1.mac_stats.rx_frames_128_255: 5254911
> 
> > dev.igb.1.mac_stats.rx_frames_256_511: 9700049
> 
> > dev.igb.1.mac_stats.rx_frames_512_1023: 16885886
> 
> > dev.igb.1.mac_stats.rx_frames_1024_1522: 23647635
> 
> > dev.igb.1.mac_stats.good_octets_recvd: 74284029276
> 
> > dev.igb.1.mac_stats.good_octets_txd: 544536708502
> 
> > dev.igb.1.mac_stats.total_pkts_txd: 485327419
> 
> > dev.igb.1.mac_stats.good_pkts_txd: 485327419
> 
> > dev.igb.1.mac_stats.bcast_pkts_txd: 72
> 
> > dev.igb.1.mac_stats.mcast_pkts_txd: 52820
> 
> > dev.igb.1.mac_stats.tx_frames_64: 57820809
> 
> > dev.igb.1.mac_stats.tx_frames_65_127: 51586341
> 
> > dev.igb.1.mac_stats.tx_frames_128_255: 7050579
> 
> > dev.igb.1.mac_stats.tx_frames_256_511: 7887126
> 
> > dev.igb.1.mac_stats.tx_frames_512_1023: 10130891
> 
> > dev.igb.1.mac_stats.tx_frames_1024_1522: 350851673
> 
> > dev.igb.1.interrupts.asserts: 551135045
> 
> > dev.igb.1.interrupts.rx_pkt_timer: 384326679
> 
> > dev.igb.1.interrupts.tx_queue_empty: 485323376
> 
> > dev.igb.1.interrupts.tx_queue_min_thresh: 6324386
> 
> > dev.igb.1.host.rx_pkt: 4424
> 
> > dev.igb.1.host.tx_good_pkt: 4043
> 
> > dev.igb.1.host.rx_good_bytes: 74284030864
> 
> > dev.igb.1.host.tx_good_bytes: 544536708502
> 
> > dev.igb.2.%desc: Intel(R) PRO/1000 Network Connection
> 
> > version - 2.3.1
> 
> > dev.igb.2.%driver: igb
> 
> > dev.igb.2.%location: slot=0 function=2
> 
> > dev.igb.2.%pnpinfo: vendor=0x8086 device=0x1521
> 
> > subvendor=0x8086 subdevice=0x5001 class=0x020000
> 
> > dev.igb.2.%parent: pci5
> 
> > dev.igb.2.nvm: -1
> 
> > dev.igb.2.enable_aim: 1
> 
> > dev.igb.2.fc: 3
> 
> > dev.igb.2.rx_processing_limit: 4096
> 
> > dev.igb.2.eee_disabled: 1
> 
> > dev.igb.2.link_irq: 2
> 
> > dev.igb.2.device_control: 1209795137
> 
> > dev.igb.2.rx_control: 67141658
> 
> > dev.igb.2.interrupt_mask: 4
> 
> > dev.igb.2.extended_int_mask: 2147483959
> 
> > dev.igb.2.fc_high_water: 33168
> 
> > dev.igb.2.fc_low_water: 33152
> 
> > dev.igb.2.queue0.interrupt_rate: 13698
> 
> > dev.igb.2.queue0.txd_head: 1618
> 
> > dev.igb.2.queue0.txd_tail: 1618
> 
> > dev.igb.2.queue0.tx_packets: 46401106
> 
> > dev.igb.2.queue0.rxd_head: 831
> 
> > dev.igb.2.queue0.rxd_tail: 827
> 
> > dev.igb.2.queue0.rx_packets: 69356350
> 
> > dev.igb.2.queue0.rx_bytes: 68488772907
> 
> > dev.igb.2.queue1.interrupt_rate: 5405
> 
> > dev.igb.2.queue1.txd_head: 190
> 
> > dev.igb.2.queue1.txd_tail: 190
> 
> > dev.igb.2.queue1.tx_packets: 55965886
> 
> > dev.igb.2.queue1.rxd_head: 268
> 
> > dev.igb.2.queue1.rxd_tail: 256
> 
> > dev.igb.2.queue1.rx_packets: 58958084
> 
> > dev.igb.2.queue1.rx_bytes: 69154569937
> 
> > dev.igb.2.queue2.interrupt_rate: 83333
> 
> > dev.igb.2.queue2.txd_head: 568
> 
> > dev.igb.2.queue2.txd_tail: 568
> 
> > dev.igb.2.queue2.tx_packets: 44974648
> 
> > dev.igb.2.queue2.rxd_head: 371
> 
> > dev.igb.2.queue2.rxd_tail: 219
> 
> > dev.igb.2.queue2.rx_packets: 67037407
> 
> > dev.igb.2.queue2.rx_bytes: 72042326102
> 
> > dev.igb.2.queue3.interrupt_rate: 12658
> 
> > dev.igb.2.queue3.txd_head: 867
> 
> > dev.igb.2.queue3.txd_tail: 867
> 
> > dev.igb.2.queue3.tx_packets: 55962467
> 
> > dev.igb.2.queue3.rxd_head: 85
> 
> > dev.igb.2.queue3.rxd_tail: 1953
> 
> > dev.igb.2.queue3.rx_packets: 60972965
> 
> > dev.igb.2.queue3.rx_bytes: 70397176035
> 
> > dev.igb.2.queue4.interrupt_rate: 90909
> 
> > dev.igb.2.queue4.txd_head: 1920
> 
> > dev.igb.2.queue4.txd_tail: 1920
> 
> > dev.igb.2.queue4.tx_packets: 47660931
> 
> > dev.igb.2.queue4.rxd_head: 1397
> 
> > dev.igb.2.queue4.rxd_tail: 1379
> 
> > dev.igb.2.queue4.rx_packets: 59110758
> 
> > dev.igb.2.queue4.rx_bytes: 68919201478
> 
> > dev.igb.2.queue5.interrupt_rate: 111111
> 
> > dev.igb.2.queue5.txd_head: 886
> 
> > dev.igb.2.queue5.txd_tail: 886
> 
> > dev.igb.2.queue5.tx_packets: 45103990
> 
> > dev.igb.2.queue5.rxd_head: 812
> 
> > dev.igb.2.queue5.rxd_tail: 799
> 
> > dev.igb.2.queue5.rx_packets: 59030312
> 
> > dev.igb.2.queue5.rx_bytes: 69234293962
> 
> > dev.igb.2.queue6.interrupt_rate: 5208
> 
> > dev.igb.2.queue6.txd_head: 1926
> 
> > dev.igb.2.queue6.txd_tail: 1926
> 
> > dev.igb.2.queue6.tx_packets: 46215046
> 
> > dev.igb.2.queue6.rxd_head: 692
> 
> > dev.igb.2.queue6.rxd_tail: 689
> 
> > dev.igb.2.queue6.rx_packets: 58256050
> 
> > dev.igb.2.queue6.rx_bytes: 68429172749
> 
> > dev.igb.2.queue7.interrupt_rate: 50000
> 
> > dev.igb.2.queue7.txd_head: 126
> 
> > dev.igb.2.queue7.txd_tail: 126
> 
> > dev.igb.2.queue7.tx_packets: 52451455
> 
> > dev.igb.2.queue7.rxd_head: 968
> 
> > dev.igb.2.queue7.rxd_tail: 885
> 
> > dev.igb.2.queue7.rx_packets: 65946491
> 
> > dev.igb.2.queue7.rx_bytes: 70263478849
> 
> > dev.igb.2.mac_stats.missed_packets: 958
> 
> > dev.igb.2.mac_stats.recv_no_buff: 69
> 
> > dev.igb.2.mac_stats.total_pkts_recvd: 498658079
> 
> > dev.igb.2.mac_stats.good_pkts_recvd: 498657121
> 
> > dev.igb.2.mac_stats.bcast_pkts_recvd: 16867
> 
> > dev.igb.2.mac_stats.mcast_pkts_recvd: 52957
> 
> > dev.igb.2.mac_stats.rx_frames_64: 59089332
> 
> > dev.igb.2.mac_stats.rx_frames_65_127: 52880118
> 
> > dev.igb.2.mac_stats.rx_frames_128_255: 7526966
> 
> > dev.igb.2.mac_stats.rx_frames_256_511: 8468389
> 
> > dev.igb.2.mac_stats.rx_frames_512_1023: 10434770
> 
> > dev.igb.2.mac_stats.rx_frames_1024_1522: 360257545
> 
> > dev.igb.2.mac_stats.good_octets_recvd: 558910494322
> 
> > dev.igb.2.mac_stats.good_octets_txd: 84618858153
> 
> > dev.igb.2.mac_stats.total_pkts_txd: 394726904
> 
> > dev.igb.2.mac_stats.good_pkts_txd: 394726904
> 
> > dev.igb.2.mac_stats.bcast_pkts_txd: 48
> 
> > dev.igb.2.mac_stats.mcast_pkts_txd: 52821
> 
> > dev.igb.2.mac_stats.tx_frames_64: 196605932
> 
> > dev.igb.2.mac_stats.tx_frames_65_127: 134602807
> 
> > dev.igb.2.mac_stats.tx_frames_128_255: 5705236
> 
> > dev.igb.2.mac_stats.tx_frames_256_511: 10267168
> 
> > dev.igb.2.mac_stats.tx_frames_512_1023: 17165496
> 
> > dev.igb.2.mac_stats.tx_frames_1024_1522: 30380265
> 
> > dev.igb.2.interrupts.asserts: 465994260
> 
> > dev.igb.2.interrupts.rx_pkt_timer: 498647546
> 
> > dev.igb.2.interrupts.tx_queue_empty: 394720657
> 
> > dev.igb.2.interrupts.tx_queue_min_thresh: 24424555
> 
> > dev.igb.2.host.rx_pkt: 9575
> 
> > dev.igb.2.host.tx_good_pkt: 6248
> 
> > dev.igb.2.host.rx_good_bytes: 558910513984
> 
> > dev.igb.2.host.tx_good_bytes: 84618858217
> 
> >
> 
> >
> 
> > Thanks for your help.
> 
> >
> 
> > Cheers,
> 
> 
> 
> You're experiencing lock contention.
>
> Try editing igb.c and setting the queues to 1. The multiqueue
> implementation in igb has a negative impact.
>
> If you have just 1 system, get a dual port 82571 card and use the
> em driver.
>
> BC
> 

I'm curious as to why you are going to spend $4000 on hardware rather
than just buying a commercial firewall. You shouldn't need 16 cores to
filter 150Kpps. You can do that with 2 cores.

BC

