High CPU interrupt load on intel I350T4 with igb on 8.3
Barney Cordoba
barney_cordoba at yahoo.com
Thu May 9 12:56:56 UTC 2013
--- On Sun, 4/28/13, Barney Cordoba <barney_cordoba at yahoo.com> wrote:
> From: Barney Cordoba <barney_cordoba at yahoo.com>
> Subject: Re: High CPU interrupt load on intel I350T4 with igb on 8.3
> To: "Jack Vogel" <jfvogel at gmail.com>
> Cc: "FreeBSD Net" <freebsd-net at freebsd.org>, "Clément Hermann (nodens)" <nodens2099 at gmail.com>
> Date: Sunday, April 28, 2013, 2:59 PM
> The point of lists is to be able to
> benefit from others' experiences so you don't have to waste
> your time "trying" things that others have already done.
> I'm not pontificating. I've done the tests. There's no
> reason for every person who is having the exact same problem
> to do the same tests over and over, hoping for some magically
> different result. The result will always be the same,
> because there's no chance of it working properly by
> chance.
> BC
>
>
> --- On Sun, 4/28/13, Jack Vogel <jfvogel at gmail.com>
> wrote:
>
> From: Jack Vogel <jfvogel at gmail.com>
> Subject: Re: High CPU interrupt load on intel I350T4 with
> igb on 8.3
> To: "Barney Cordoba" <barney_cordoba at yahoo.com>
> Cc: "FreeBSD Net" <freebsd-net at freebsd.org>,
> "Clément Hermann (nodens)" <nodens2099 at gmail.com>
> Date: Sunday, April 28, 2013, 1:07 PM
>
> Try setting your queues to 1, run some tests, then try
> setting your queues to 2, then to 4... it's called tuning, and
> rather than just pontificating about it, which Barney so
> loves to do, you can
> discover what works best. I ran tests last week preparing
> for a new driver version and found the best results came not
> only while tweaking queues, but also ring size, and I could
> see changes based
> on the buf ring size.... There are lots of things that may
> improve or degrade performance depending on the workload.
> Jack
>
>
>
> On Sun, Apr 28, 2013 at 7:21 AM, Barney Cordoba <barney_cordoba at yahoo.com>
> wrote:
>
>
>
>
>
> --- On Fri, 4/26/13, "Clément Hermann (nodens)" <nodens2099 at gmail.com>
> wrote:
>
>
>
> > From: "Clément Hermann (nodens)" <nodens2099 at gmail.com>
>
> > Subject: High CPU interrupt load on intel I350T4 with
> igb on 8.3
>
> > To: freebsd-net at freebsd.org
>
> > Date: Friday, April 26, 2013, 7:31 AM
>
> > Hi list,
>
> >
>
> > We use pf+ALTQ for traffic shaping on some routers.
>
> >
> > We are switching to new servers: Dell PowerEdge R620 with two
> > 8-core Intel processors (E5-2650L), 8GB RAM and an Intel I350T4
> > (quad port) using the igb driver. The old hardware uses the em
> > driver; its CPU load is high, but mostly due to the kernel and a
> > large pf ruleset.
>
> >
>
> > On the new hardware, we see high CPU interrupt load (up to
> > 95%), even though there is not much traffic currently (peaks
> > of about 150Mbps and 40Kpps). All queues are used and bound to
> > a CPU according to top, but a lot of CPU time is spent on
> > igb queues (interrupt or wait). The load is fine when we
> > stay below 20Kpps.
>
> >
>
> > We see no mbuf shortage and no dropped packets, but there is
> > little margin left on CPU time (about 25% idle at best; most
> > of the CPU time is spent on interrupts), which is disturbing.
>
> >
>
> > We have done some tuning, but to no avail:
>
> >
>
> > sysctl.conf:
>
> >
>
> > # mbufs
>
> > kern.ipc.nmbclusters=65536
>
> > # Sockets
>
> > kern.ipc.somaxconn=8192
>
> > net.inet.tcp.delayed_ack=0
>
> > net.inet.tcp.sendspace=65535
>
> > net.inet.udp.recvspace=65535
>
> > net.inet.udp.maxdgram=57344
>
> > net.local.stream.recvspace=65535
>
> > net.local.stream.sendspace=65535
>
> > # IGB
>
> > dev.igb.0.rx_processing_limit=4096
>
> > dev.igb.1.rx_processing_limit=4096
>
> > dev.igb.2.rx_processing_limit=4096
>
> > dev.igb.3.rx_processing_limit=4096
>
> >
>
> > /boot/loader.conf:
> >
> > vm.kmem_size=1G
> > # maximum number of interrupts/sec generated by single igb(4) (default 8000)
> > hw.igb.max_interrupt_rate="32000"
> > # number of transmit descriptors allocated by the driver (2048 limit)
> > hw.igb.txd="2048"
> > # number of receive descriptors allocated by the driver (2048 limit)
> > hw.igb.rxd="2048"
> > # maximum number of received packets to process at a time. The default of 100 is
> > # too low for most firewalls. (-1 means unlimited)
> > hw.igb.rx_process_limit="1000"
>
> >
>
> > Kernel HZ is 1000.
> >
> > The igb /boot/loader.conf tuning was our last attempt; it
> > didn't change anything.
> >
> > Does anyone have any pointers? How could we lower the CPU
> > interrupt load? Should we set hw.igb.max_interrupt_rate
> > lower instead of higher?
> >
> > From what we saw here and there, we should be able to do
> > much better with this hardware.
>
> >
>
> >
>
> > Relevant sysctls (igb1 and igb2 only; the other interfaces are
> > unused):
>
> >
>
> > sysctl dev.igb | grep -v ": 0$"
>
> > dev.igb.1.%desc: Intel(R) PRO/1000 Network Connection
>
> > version - 2.3.1
>
> > dev.igb.1.%driver: igb
>
> > dev.igb.1.%location: slot=0 function=1
>
> > dev.igb.1.%pnpinfo: vendor=0x8086 device=0x1521
>
> > subvendor=0x8086 subdevice=0x5001 class=0x020000
>
> > dev.igb.1.%parent: pci5
>
> > dev.igb.1.nvm: -1
>
> > dev.igb.1.enable_aim: 1
>
> > dev.igb.1.fc: 3
>
> > dev.igb.1.rx_processing_limit: 4096
>
> > dev.igb.1.eee_disabled: 1
>
> > dev.igb.1.link_irq: 2
>
> > dev.igb.1.device_control: 1209795137
>
> > dev.igb.1.rx_control: 67141658
>
> > dev.igb.1.interrupt_mask: 4
>
> > dev.igb.1.extended_int_mask: 2147483981
>
> > dev.igb.1.fc_high_water: 33168
>
> > dev.igb.1.fc_low_water: 33152
>
> > dev.igb.1.queue0.interrupt_rate: 71428
>
> > dev.igb.1.queue0.txd_head: 1318
>
> > dev.igb.1.queue0.txd_tail: 1318
>
> > dev.igb.1.queue0.tx_packets: 84663594
>
> > dev.igb.1.queue0.rxd_head: 717
>
> > dev.igb.1.queue0.rxd_tail: 715
>
> > dev.igb.1.queue0.rx_packets: 43899597
>
> > dev.igb.1.queue0.rx_bytes: 8905556030
>
> > dev.igb.1.queue1.interrupt_rate: 90909
>
> > dev.igb.1.queue1.txd_head: 693
>
> > dev.igb.1.queue1.txd_tail: 693
>
> > dev.igb.1.queue1.tx_packets: 57543349
>
> > dev.igb.1.queue1.rxd_head: 1033
>
> > dev.igb.1.queue1.rxd_tail: 1032
>
> > dev.igb.1.queue1.rx_packets: 54821897
>
> > dev.igb.1.queue1.rx_bytes: 9944955108
>
> > dev.igb.1.queue2.interrupt_rate: 100000
>
> > dev.igb.1.queue2.txd_head: 350
>
> > dev.igb.1.queue2.txd_tail: 350
>
> > dev.igb.1.queue2.tx_packets: 62320990
>
> > dev.igb.1.queue2.rxd_head: 1962
>
> > dev.igb.1.queue2.rxd_tail: 1939
>
> > dev.igb.1.queue2.rx_packets: 43909016
>
> > dev.igb.1.queue2.rx_bytes: 8673941461
>
> > dev.igb.1.queue3.interrupt_rate: 14925
>
> > dev.igb.1.queue3.txd_head: 647
>
> > dev.igb.1.queue3.txd_tail: 647
>
> > dev.igb.1.queue3.tx_packets: 58776199
>
> > dev.igb.1.queue3.rxd_head: 692
>
> > dev.igb.1.queue3.rxd_tail: 691
>
> > dev.igb.1.queue3.rx_packets: 55138996
>
> > dev.igb.1.queue3.rx_bytes: 9310217354
>
> > dev.igb.1.queue4.interrupt_rate: 100000
>
> > dev.igb.1.queue4.txd_head: 1721
>
> > dev.igb.1.queue4.txd_tail: 1721
>
> > dev.igb.1.queue4.tx_packets: 54337209
>
> > dev.igb.1.queue4.rxd_head: 1609
>
> > dev.igb.1.queue4.rxd_tail: 1598
>
> > dev.igb.1.queue4.rx_packets: 46546503
>
> > dev.igb.1.queue4.rx_bytes: 8818182840
>
> > dev.igb.1.queue5.interrupt_rate: 11627
>
> > dev.igb.1.queue5.txd_head: 254
>
> > dev.igb.1.queue5.txd_tail: 254
>
> > dev.igb.1.queue5.tx_packets: 53117182
>
> > dev.igb.1.queue5.rxd_head: 701
>
> > dev.igb.1.queue5.rxd_tail: 685
>
> > dev.igb.1.queue5.rx_packets: 43014837
>
> > dev.igb.1.queue5.rx_bytes: 8699057447
>
> > dev.igb.1.queue6.interrupt_rate: 55555
>
> > dev.igb.1.queue6.txd_head: 8
>
> > dev.igb.1.queue6.txd_tail: 8
>
> > dev.igb.1.queue6.tx_packets: 52654088
>
> > dev.igb.1.queue6.rxd_head: 1057
>
> > dev.igb.1.queue6.rxd_tail: 1041
>
> > dev.igb.1.queue6.rx_packets: 45227030
>
> > dev.igb.1.queue6.rx_bytes: 9494489640
>
> > dev.igb.1.queue7.interrupt_rate: 5235
>
> > dev.igb.1.queue7.txd_head: 729
>
> > dev.igb.1.queue7.txd_tail: 729
>
> > dev.igb.1.queue7.tx_packets: 61926105
>
> > dev.igb.1.queue7.rxd_head: 146
>
> > dev.igb.1.queue7.rxd_tail: 140
>
> > dev.igb.1.queue7.rx_packets: 51781775
>
> > dev.igb.1.queue7.rx_bytes: 8901279226
>
> > dev.igb.1.mac_stats.missed_packets: 1657
>
> > dev.igb.1.mac_stats.recv_no_buff: 405
>
> > dev.igb.1.mac_stats.total_pkts_recvd: 384332760
>
> > dev.igb.1.mac_stats.good_pkts_recvd: 384331103
>
> > dev.igb.1.mac_stats.bcast_pkts_recvd: 15510
>
> > dev.igb.1.mac_stats.mcast_pkts_recvd: 52957
>
> > dev.igb.1.mac_stats.rx_frames_64: 195496498
>
> > dev.igb.1.mac_stats.rx_frames_65_127: 133346124
>
> > dev.igb.1.mac_stats.rx_frames_128_255: 5254911
>
> > dev.igb.1.mac_stats.rx_frames_256_511: 9700049
>
> > dev.igb.1.mac_stats.rx_frames_512_1023: 16885886
>
> > dev.igb.1.mac_stats.rx_frames_1024_1522: 23647635
>
> > dev.igb.1.mac_stats.good_octets_recvd: 74284029276
>
> > dev.igb.1.mac_stats.good_octets_txd: 544536708502
>
> > dev.igb.1.mac_stats.total_pkts_txd: 485327419
>
> > dev.igb.1.mac_stats.good_pkts_txd: 485327419
>
> > dev.igb.1.mac_stats.bcast_pkts_txd: 72
>
> > dev.igb.1.mac_stats.mcast_pkts_txd: 52820
>
> > dev.igb.1.mac_stats.tx_frames_64: 57820809
>
> > dev.igb.1.mac_stats.tx_frames_65_127: 51586341
>
> > dev.igb.1.mac_stats.tx_frames_128_255: 7050579
>
> > dev.igb.1.mac_stats.tx_frames_256_511: 7887126
>
> > dev.igb.1.mac_stats.tx_frames_512_1023: 10130891
>
> > dev.igb.1.mac_stats.tx_frames_1024_1522: 350851673
>
> > dev.igb.1.interrupts.asserts: 551135045
>
> > dev.igb.1.interrupts.rx_pkt_timer: 384326679
>
> > dev.igb.1.interrupts.tx_queue_empty: 485323376
>
> > dev.igb.1.interrupts.tx_queue_min_thresh: 6324386
>
> > dev.igb.1.host.rx_pkt: 4424
>
> > dev.igb.1.host.tx_good_pkt: 4043
>
> > dev.igb.1.host.rx_good_bytes: 74284030864
>
> > dev.igb.1.host.tx_good_bytes: 544536708502
>
> > dev.igb.2.%desc: Intel(R) PRO/1000 Network Connection
>
> > version - 2.3.1
>
> > dev.igb.2.%driver: igb
>
> > dev.igb.2.%location: slot=0 function=2
>
> > dev.igb.2.%pnpinfo: vendor=0x8086 device=0x1521
>
> > subvendor=0x8086 subdevice=0x5001 class=0x020000
>
> > dev.igb.2.%parent: pci5
>
> > dev.igb.2.nvm: -1
>
> > dev.igb.2.enable_aim: 1
>
> > dev.igb.2.fc: 3
>
> > dev.igb.2.rx_processing_limit: 4096
>
> > dev.igb.2.eee_disabled: 1
>
> > dev.igb.2.link_irq: 2
>
> > dev.igb.2.device_control: 1209795137
>
> > dev.igb.2.rx_control: 67141658
>
> > dev.igb.2.interrupt_mask: 4
>
> > dev.igb.2.extended_int_mask: 2147483959
>
> > dev.igb.2.fc_high_water: 33168
>
> > dev.igb.2.fc_low_water: 33152
>
> > dev.igb.2.queue0.interrupt_rate: 13698
>
> > dev.igb.2.queue0.txd_head: 1618
>
> > dev.igb.2.queue0.txd_tail: 1618
>
> > dev.igb.2.queue0.tx_packets: 46401106
>
> > dev.igb.2.queue0.rxd_head: 831
>
> > dev.igb.2.queue0.rxd_tail: 827
>
> > dev.igb.2.queue0.rx_packets: 69356350
>
> > dev.igb.2.queue0.rx_bytes: 68488772907
>
> > dev.igb.2.queue1.interrupt_rate: 5405
>
> > dev.igb.2.queue1.txd_head: 190
>
> > dev.igb.2.queue1.txd_tail: 190
>
> > dev.igb.2.queue1.tx_packets: 55965886
>
> > dev.igb.2.queue1.rxd_head: 268
>
> > dev.igb.2.queue1.rxd_tail: 256
>
> > dev.igb.2.queue1.rx_packets: 58958084
>
> > dev.igb.2.queue1.rx_bytes: 69154569937
>
> > dev.igb.2.queue2.interrupt_rate: 83333
>
> > dev.igb.2.queue2.txd_head: 568
>
> > dev.igb.2.queue2.txd_tail: 568
>
> > dev.igb.2.queue2.tx_packets: 44974648
>
> > dev.igb.2.queue2.rxd_head: 371
>
> > dev.igb.2.queue2.rxd_tail: 219
>
> > dev.igb.2.queue2.rx_packets: 67037407
>
> > dev.igb.2.queue2.rx_bytes: 72042326102
>
> > dev.igb.2.queue3.interrupt_rate: 12658
>
> > dev.igb.2.queue3.txd_head: 867
>
> > dev.igb.2.queue3.txd_tail: 867
>
> > dev.igb.2.queue3.tx_packets: 55962467
>
> > dev.igb.2.queue3.rxd_head: 85
>
> > dev.igb.2.queue3.rxd_tail: 1953
>
> > dev.igb.2.queue3.rx_packets: 60972965
>
> > dev.igb.2.queue3.rx_bytes: 70397176035
>
> > dev.igb.2.queue4.interrupt_rate: 90909
>
> > dev.igb.2.queue4.txd_head: 1920
>
> > dev.igb.2.queue4.txd_tail: 1920
>
> > dev.igb.2.queue4.tx_packets: 47660931
>
> > dev.igb.2.queue4.rxd_head: 1397
>
> > dev.igb.2.queue4.rxd_tail: 1379
>
> > dev.igb.2.queue4.rx_packets: 59110758
>
> > dev.igb.2.queue4.rx_bytes: 68919201478
>
> > dev.igb.2.queue5.interrupt_rate: 111111
>
> > dev.igb.2.queue5.txd_head: 886
>
> > dev.igb.2.queue5.txd_tail: 886
>
> > dev.igb.2.queue5.tx_packets: 45103990
>
> > dev.igb.2.queue5.rxd_head: 812
>
> > dev.igb.2.queue5.rxd_tail: 799
>
> > dev.igb.2.queue5.rx_packets: 59030312
>
> > dev.igb.2.queue5.rx_bytes: 69234293962
>
> > dev.igb.2.queue6.interrupt_rate: 5208
>
> > dev.igb.2.queue6.txd_head: 1926
>
> > dev.igb.2.queue6.txd_tail: 1926
>
> > dev.igb.2.queue6.tx_packets: 46215046
>
> > dev.igb.2.queue6.rxd_head: 692
>
> > dev.igb.2.queue6.rxd_tail: 689
>
> > dev.igb.2.queue6.rx_packets: 58256050
>
> > dev.igb.2.queue6.rx_bytes: 68429172749
>
> > dev.igb.2.queue7.interrupt_rate: 50000
>
> > dev.igb.2.queue7.txd_head: 126
>
> > dev.igb.2.queue7.txd_tail: 126
>
> > dev.igb.2.queue7.tx_packets: 52451455
>
> > dev.igb.2.queue7.rxd_head: 968
>
> > dev.igb.2.queue7.rxd_tail: 885
>
> > dev.igb.2.queue7.rx_packets: 65946491
>
> > dev.igb.2.queue7.rx_bytes: 70263478849
>
> > dev.igb.2.mac_stats.missed_packets: 958
>
> > dev.igb.2.mac_stats.recv_no_buff: 69
>
> > dev.igb.2.mac_stats.total_pkts_recvd: 498658079
>
> > dev.igb.2.mac_stats.good_pkts_recvd: 498657121
>
> > dev.igb.2.mac_stats.bcast_pkts_recvd: 16867
>
> > dev.igb.2.mac_stats.mcast_pkts_recvd: 52957
>
> > dev.igb.2.mac_stats.rx_frames_64: 59089332
>
> > dev.igb.2.mac_stats.rx_frames_65_127: 52880118
>
> > dev.igb.2.mac_stats.rx_frames_128_255: 7526966
>
> > dev.igb.2.mac_stats.rx_frames_256_511: 8468389
>
> > dev.igb.2.mac_stats.rx_frames_512_1023: 10434770
>
> > dev.igb.2.mac_stats.rx_frames_1024_1522: 360257545
>
> > dev.igb.2.mac_stats.good_octets_recvd: 558910494322
>
> > dev.igb.2.mac_stats.good_octets_txd: 84618858153
>
> > dev.igb.2.mac_stats.total_pkts_txd: 394726904
>
> > dev.igb.2.mac_stats.good_pkts_txd: 394726904
>
> > dev.igb.2.mac_stats.bcast_pkts_txd: 48
>
> > dev.igb.2.mac_stats.mcast_pkts_txd: 52821
>
> > dev.igb.2.mac_stats.tx_frames_64: 196605932
>
> > dev.igb.2.mac_stats.tx_frames_65_127: 134602807
>
> > dev.igb.2.mac_stats.tx_frames_128_255: 5705236
>
> > dev.igb.2.mac_stats.tx_frames_256_511: 10267168
>
> > dev.igb.2.mac_stats.tx_frames_512_1023: 17165496
>
> > dev.igb.2.mac_stats.tx_frames_1024_1522: 30380265
>
> > dev.igb.2.interrupts.asserts: 465994260
>
> > dev.igb.2.interrupts.rx_pkt_timer: 498647546
>
> > dev.igb.2.interrupts.tx_queue_empty: 394720657
>
> > dev.igb.2.interrupts.tx_queue_min_thresh: 24424555
>
> > dev.igb.2.host.rx_pkt: 9575
>
> > dev.igb.2.host.tx_good_pkt: 6248
>
> > dev.igb.2.host.rx_good_bytes: 558910513984
>
> > dev.igb.2.host.tx_good_bytes: 84618858217
>
> >
>
> >
>
> > Thanks for your help.
>
> >
>
> > Cheers,
>
>
>
> You're experiencing lock contention.
>
> Try editing igb.c and setting the queues to 1. The multiqueue
> implementation in igb has a negative impact.
>
> If you have just 1 system, get a dual port 82571 card and use the
> em driver.
>
>
>
> BC
>
I'm curious as to why you are going to spend $4000 on hardware rather
than just buying a commercial firewall? You shouldn't need 16 cores to
filter 150Kpps. You can do that with 2 cores.
BC
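
The counters in the quoted sysctl output can be used to put a rough
number on how many packets each interrupt handles. What follows is a
minimal Python sketch, not anything from the igb driver. The names
SAMPLE, parse_sysctl and packets_per_interrupt are made up for
illustration, and treating dev.igb.N.interrupts.asserts as the total
interrupt count for the port is an assumption.

```python
import re

# A few lines copied from the quoted `sysctl dev.igb` output for igb1.
SAMPLE = """\
dev.igb.1.%driver: igb
dev.igb.1.mac_stats.total_pkts_recvd: 384332760
dev.igb.1.mac_stats.total_pkts_txd: 485327419
dev.igb.1.interrupts.asserts: 551135045
"""


def parse_sysctl(text):
    """Return {oid: int} for every 'dev.igb...: integer' line; skip the rest."""
    values = {}
    for line in text.splitlines():
        m = re.match(r"^(dev\.igb\.\S+):\s*(-?\d+)$", line.strip())
        if m:
            values[m.group(1)] = int(m.group(2))
    return values


def packets_per_interrupt(values, unit):
    """Packets (rx + tx) divided by interrupt assertions for one igb unit.

    Assumption: interrupts.asserts counts every interrupt raised by the port.
    """
    prefix = f"dev.igb.{unit}."
    pkts = (values[prefix + "mac_stats.total_pkts_recvd"]
            + values[prefix + "mac_stats.total_pkts_txd"])
    return pkts / values[prefix + "interrupts.asserts"]


stats = parse_sysctl(SAMPLE)
ratio = packets_per_interrupt(stats, 1)
print(f"igb1: {ratio:.2f} packets per interrupt assertion")
```

On the igb1 counters quoted above this comes out to roughly 1.6 packets
per interrupt assertion. If the assumption about interrupts.asserts
holds, that would suggest very little interrupt coalescing is happening,
but it is only a ratio of lifetime counters, not a measurement of the
current rate.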