High CPU interrupt load on intel I350T4 with igb on 8.3

Barney Cordoba barney_cordoba at yahoo.com
Sun Apr 28 19:01:53 UTC 2013


The point of lists is to be able to benefit from others' experiences so you don't have to waste your time "trying" things that others have already done.
I'm not pontificating. I've done the tests. There's no reason for every person who is having the exact same problem to do the same tests over and over, hoping for some magically different result. The result will always be the same, because there's no chance of it working properly by chance.
BC


--- On Sun, 4/28/13, Jack Vogel <jfvogel at gmail.com> wrote:

From: Jack Vogel <jfvogel at gmail.com>
Subject: Re: High CPU interrupt load on intel I350T4 with igb on 8.3
To: "Barney Cordoba" <barney_cordoba at yahoo.com>
Cc: "FreeBSD Net" <freebsd-net at freebsd.org>, "Clément Hermann (nodens)" <nodens2099 at gmail.com>
Date: Sunday, April 28, 2013, 1:07 PM

Try setting your queues to 1, run some tests, then try setting your queues to 2, then to 4... it's called tuning, and rather than just pontificating about it, which Barney so loves to do, you can discover what works best.
I ran tests last week preparing for a new driver version and found the best results came not only while tweaking queues, but also ring size, and I could see changes based on the buf ring size.... There are lots of things that may improve or degrade performance depending on the workload.
Jack
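
A minimal sketch of the queue sweep described above, assuming the igb(4) driver in use honours the hw.igb.num_queues loader tunable (that name does not appear in this thread; if the driver lacks it, the queue count has to be changed in the driver source instead). The ring sizes reuse the hw.igb.rxd and hw.igb.txd tunables from the original post. The values are illustrative, not recommendations:

```conf
# /boot/loader.conf: change one value, reboot, rerun the same traffic test
hw.igb.num_queues="1"     # assumed tunable: try 1, then 2, then 4
hw.igb.rxd="2048"         # receive descriptors, value taken from the original post
hw.igb.txd="2048"         # transmit descriptors, value taken from the original post
```

Comparing the interrupt load shown by top and the per-queue dev.igb.N.queueM.interrupt_rate values between runs should show which setting, if any, suits the workload.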



On Sun, Apr 28, 2013 at 7:21 AM, Barney Cordoba <barney_cordoba at yahoo.com> wrote:

--- On Fri, 4/26/13, "Clément Hermann (nodens)" <nodens2099 at gmail.com> wrote:

> From: "Clément Hermann (nodens)" <nodens2099 at gmail.com>
> Subject: High CPU interrupt load on intel I350T4 with igb on 8.3
> To: freebsd-net at freebsd.org
> Date: Friday, April 26, 2013, 7:31 AM
>
> Hi list,
>
> We use pf+ALTQ for traffic shaping on some routers.
>
> We are switching to new servers: Dell PowerEdge R620 with two
> 8-core Intel processors (E5-2650L), 8GB RAM and an Intel I350T4
> (quad port) using the igb driver. The old hardware uses the em
> driver; the CPU load is high, but mostly due to the kernel and a
> large pf ruleset.
>
> On the new hardware, we see high CPU interrupt load (up to
> 95%), even though there is not much traffic currently (peaks
> of about 150Mbps and 40Kpps). All queues are used and bound to
> a CPU according to top, but a lot of CPU time is spent on
> igb queues (interrupt or wait). The load is fine when we
> stay below 20Kpps.
>
> We see no mbuf shortage and no dropped packets, but there is
> little margin left on CPU time (about 25% idle at best, most
> of CPU time is spent on interrupts), which is disturbing.
>
> We have done some tuning, but to no avail:
>
> sysctl.conf:
>
> # mbufs
> kern.ipc.nmbclusters=65536
> # Sockets
> kern.ipc.somaxconn=8192
> net.inet.tcp.delayed_ack=0
> net.inet.tcp.sendspace=65535
> net.inet.udp.recvspace=65535
> net.inet.udp.maxdgram=57344
> net.local.stream.recvspace=65535
> net.local.stream.sendspace=65535
> # IGB
> dev.igb.0.rx_processing_limit=4096
> dev.igb.1.rx_processing_limit=4096
> dev.igb.2.rx_processing_limit=4096
> dev.igb.3.rx_processing_limit=4096
>
> /boot/loader.conf:
>
> vm.kmem_size=1G
> hw.igb.max_interrupt_rate="32000"  # maximum number of interrupts/sec generated by single igb(4) (default 8000)
> hw.igb.txd="2048"                  # number of transmit descriptors allocated by the driver (2048 limit)
> hw.igb.rxd="2048"                  # number of receive descriptors allocated by the driver (2048 limit)
> hw.igb.rx_process_limit="1000"     # maximum number of received packets to process at a time. The default of 100 is
>                                    # too low for most firewalls. (-1 means unlimited)
>
> Kernel HZ is 1000.
>
> The IGB /boot/loader.conf tuning was our last attempt; it
> didn't change anything.
>
> Does anyone have any pointers? How could we lower CPU
> interrupt load? Should we set hw.igb.max_interrupt_rate
> lower instead of higher?
> From what we saw here and there, we should be able to do
> much better with this hardware.
>
> Relevant sysctl output (igb1 and igb2 only, other interfaces are
> unused):
>
> sysctl dev.igb | grep -v ": 0$"
> dev.igb.1.%desc: Intel(R) PRO/1000 Network Connection version - 2.3.1
> dev.igb.1.%driver: igb
> dev.igb.1.%location: slot=0 function=1
> dev.igb.1.%pnpinfo: vendor=0x8086 device=0x1521 subvendor=0x8086 subdevice=0x5001 class=0x020000
> dev.igb.1.%parent: pci5
> dev.igb.1.nvm: -1
> dev.igb.1.enable_aim: 1
> dev.igb.1.fc: 3
> dev.igb.1.rx_processing_limit: 4096
> dev.igb.1.eee_disabled: 1
> dev.igb.1.link_irq: 2
> dev.igb.1.device_control: 1209795137
> dev.igb.1.rx_control: 67141658
> dev.igb.1.interrupt_mask: 4
> dev.igb.1.extended_int_mask: 2147483981
> dev.igb.1.fc_high_water: 33168
> dev.igb.1.fc_low_water: 33152
> dev.igb.1.queue0.interrupt_rate: 71428
> dev.igb.1.queue0.txd_head: 1318
> dev.igb.1.queue0.txd_tail: 1318
> dev.igb.1.queue0.tx_packets: 84663594
> dev.igb.1.queue0.rxd_head: 717
> dev.igb.1.queue0.rxd_tail: 715
> dev.igb.1.queue0.rx_packets: 43899597
> dev.igb.1.queue0.rx_bytes: 8905556030
> dev.igb.1.queue1.interrupt_rate: 90909
> dev.igb.1.queue1.txd_head: 693
> dev.igb.1.queue1.txd_tail: 693
> dev.igb.1.queue1.tx_packets: 57543349
> dev.igb.1.queue1.rxd_head: 1033
> dev.igb.1.queue1.rxd_tail: 1032
> dev.igb.1.queue1.rx_packets: 54821897
> dev.igb.1.queue1.rx_bytes: 9944955108
> dev.igb.1.queue2.interrupt_rate: 100000
> dev.igb.1.queue2.txd_head: 350
> dev.igb.1.queue2.txd_tail: 350
> dev.igb.1.queue2.tx_packets: 62320990
> dev.igb.1.queue2.rxd_head: 1962
> dev.igb.1.queue2.rxd_tail: 1939
> dev.igb.1.queue2.rx_packets: 43909016
> dev.igb.1.queue2.rx_bytes: 8673941461
> dev.igb.1.queue3.interrupt_rate: 14925
> dev.igb.1.queue3.txd_head: 647
> dev.igb.1.queue3.txd_tail: 647
> dev.igb.1.queue3.tx_packets: 58776199
> dev.igb.1.queue3.rxd_head: 692
> dev.igb.1.queue3.rxd_tail: 691
> dev.igb.1.queue3.rx_packets: 55138996
> dev.igb.1.queue3.rx_bytes: 9310217354
> dev.igb.1.queue4.interrupt_rate: 100000
> dev.igb.1.queue4.txd_head: 1721
> dev.igb.1.queue4.txd_tail: 1721
> dev.igb.1.queue4.tx_packets: 54337209
> dev.igb.1.queue4.rxd_head: 1609
> dev.igb.1.queue4.rxd_tail: 1598
> dev.igb.1.queue4.rx_packets: 46546503
> dev.igb.1.queue4.rx_bytes: 8818182840
> dev.igb.1.queue5.interrupt_rate: 11627
> dev.igb.1.queue5.txd_head: 254
> dev.igb.1.queue5.txd_tail: 254
> dev.igb.1.queue5.tx_packets: 53117182
> dev.igb.1.queue5.rxd_head: 701
> dev.igb.1.queue5.rxd_tail: 685
> dev.igb.1.queue5.rx_packets: 43014837
> dev.igb.1.queue5.rx_bytes: 8699057447
> dev.igb.1.queue6.interrupt_rate: 55555
> dev.igb.1.queue6.txd_head: 8
> dev.igb.1.queue6.txd_tail: 8
> dev.igb.1.queue6.tx_packets: 52654088
> dev.igb.1.queue6.rxd_head: 1057
> dev.igb.1.queue6.rxd_tail: 1041
> dev.igb.1.queue6.rx_packets: 45227030
> dev.igb.1.queue6.rx_bytes: 9494489640
> dev.igb.1.queue7.interrupt_rate: 5235
> dev.igb.1.queue7.txd_head: 729
> dev.igb.1.queue7.txd_tail: 729
> dev.igb.1.queue7.tx_packets: 61926105
> dev.igb.1.queue7.rxd_head: 146
> dev.igb.1.queue7.rxd_tail: 140
> dev.igb.1.queue7.rx_packets: 51781775
> dev.igb.1.queue7.rx_bytes: 8901279226
> dev.igb.1.mac_stats.missed_packets: 1657
> dev.igb.1.mac_stats.recv_no_buff: 405
> dev.igb.1.mac_stats.total_pkts_recvd: 384332760
> dev.igb.1.mac_stats.good_pkts_recvd: 384331103
> dev.igb.1.mac_stats.bcast_pkts_recvd: 15510
> dev.igb.1.mac_stats.mcast_pkts_recvd: 52957
> dev.igb.1.mac_stats.rx_frames_64: 195496498
> dev.igb.1.mac_stats.rx_frames_65_127: 133346124
> dev.igb.1.mac_stats.rx_frames_128_255: 5254911
> dev.igb.1.mac_stats.rx_frames_256_511: 9700049
> dev.igb.1.mac_stats.rx_frames_512_1023: 16885886
> dev.igb.1.mac_stats.rx_frames_1024_1522: 23647635
> dev.igb.1.mac_stats.good_octets_recvd: 74284029276
> dev.igb.1.mac_stats.good_octets_txd: 544536708502
> dev.igb.1.mac_stats.total_pkts_txd: 485327419
> dev.igb.1.mac_stats.good_pkts_txd: 485327419
> dev.igb.1.mac_stats.bcast_pkts_txd: 72
> dev.igb.1.mac_stats.mcast_pkts_txd: 52820
> dev.igb.1.mac_stats.tx_frames_64: 57820809
> dev.igb.1.mac_stats.tx_frames_65_127: 51586341
> dev.igb.1.mac_stats.tx_frames_128_255: 7050579
> dev.igb.1.mac_stats.tx_frames_256_511: 7887126
> dev.igb.1.mac_stats.tx_frames_512_1023: 10130891
> dev.igb.1.mac_stats.tx_frames_1024_1522: 350851673
> dev.igb.1.interrupts.asserts: 551135045
> dev.igb.1.interrupts.rx_pkt_timer: 384326679
> dev.igb.1.interrupts.tx_queue_empty: 485323376
> dev.igb.1.interrupts.tx_queue_min_thresh: 6324386
> dev.igb.1.host.rx_pkt: 4424
> dev.igb.1.host.tx_good_pkt: 4043
> dev.igb.1.host.rx_good_bytes: 74284030864
> dev.igb.1.host.tx_good_bytes: 544536708502
> dev.igb.2.%desc: Intel(R) PRO/1000 Network Connection version - 2.3.1
> dev.igb.2.%driver: igb
> dev.igb.2.%location: slot=0 function=2
> dev.igb.2.%pnpinfo: vendor=0x8086 device=0x1521 subvendor=0x8086 subdevice=0x5001 class=0x020000
> dev.igb.2.%parent: pci5
> dev.igb.2.nvm: -1
> dev.igb.2.enable_aim: 1
> dev.igb.2.fc: 3
> dev.igb.2.rx_processing_limit: 4096
> dev.igb.2.eee_disabled: 1
> dev.igb.2.link_irq: 2
> dev.igb.2.device_control: 1209795137
> dev.igb.2.rx_control: 67141658
> dev.igb.2.interrupt_mask: 4
> dev.igb.2.extended_int_mask: 2147483959
> dev.igb.2.fc_high_water: 33168
> dev.igb.2.fc_low_water: 33152
> dev.igb.2.queue0.interrupt_rate: 13698
> dev.igb.2.queue0.txd_head: 1618
> dev.igb.2.queue0.txd_tail: 1618
> dev.igb.2.queue0.tx_packets: 46401106
> dev.igb.2.queue0.rxd_head: 831
> dev.igb.2.queue0.rxd_tail: 827
> dev.igb.2.queue0.rx_packets: 69356350
> dev.igb.2.queue0.rx_bytes: 68488772907
> dev.igb.2.queue1.interrupt_rate: 5405
> dev.igb.2.queue1.txd_head: 190
> dev.igb.2.queue1.txd_tail: 190
> dev.igb.2.queue1.tx_packets: 55965886
> dev.igb.2.queue1.rxd_head: 268
> dev.igb.2.queue1.rxd_tail: 256
> dev.igb.2.queue1.rx_packets: 58958084
> dev.igb.2.queue1.rx_bytes: 69154569937
> dev.igb.2.queue2.interrupt_rate: 83333
> dev.igb.2.queue2.txd_head: 568
> dev.igb.2.queue2.txd_tail: 568
> dev.igb.2.queue2.tx_packets: 44974648
> dev.igb.2.queue2.rxd_head: 371
> dev.igb.2.queue2.rxd_tail: 219
> dev.igb.2.queue2.rx_packets: 67037407
> dev.igb.2.queue2.rx_bytes: 72042326102
> dev.igb.2.queue3.interrupt_rate: 12658
> dev.igb.2.queue3.txd_head: 867
> dev.igb.2.queue3.txd_tail: 867
> dev.igb.2.queue3.tx_packets: 55962467
> dev.igb.2.queue3.rxd_head: 85
> dev.igb.2.queue3.rxd_tail: 1953
> dev.igb.2.queue3.rx_packets: 60972965
> dev.igb.2.queue3.rx_bytes: 70397176035
> dev.igb.2.queue4.interrupt_rate: 90909
> dev.igb.2.queue4.txd_head: 1920
> dev.igb.2.queue4.txd_tail: 1920
> dev.igb.2.queue4.tx_packets: 47660931
> dev.igb.2.queue4.rxd_head: 1397
> dev.igb.2.queue4.rxd_tail: 1379
> dev.igb.2.queue4.rx_packets: 59110758
> dev.igb.2.queue4.rx_bytes: 68919201478
> dev.igb.2.queue5.interrupt_rate: 111111
> dev.igb.2.queue5.txd_head: 886
> dev.igb.2.queue5.txd_tail: 886
> dev.igb.2.queue5.tx_packets: 45103990
> dev.igb.2.queue5.rxd_head: 812
> dev.igb.2.queue5.rxd_tail: 799
> dev.igb.2.queue5.rx_packets: 59030312
> dev.igb.2.queue5.rx_bytes: 69234293962
> dev.igb.2.queue6.interrupt_rate: 5208
> dev.igb.2.queue6.txd_head: 1926
> dev.igb.2.queue6.txd_tail: 1926
> dev.igb.2.queue6.tx_packets: 46215046
> dev.igb.2.queue6.rxd_head: 692
> dev.igb.2.queue6.rxd_tail: 689
> dev.igb.2.queue6.rx_packets: 58256050
> dev.igb.2.queue6.rx_bytes: 68429172749
> dev.igb.2.queue7.interrupt_rate: 50000
> dev.igb.2.queue7.txd_head: 126
> dev.igb.2.queue7.txd_tail: 126
> dev.igb.2.queue7.tx_packets: 52451455
> dev.igb.2.queue7.rxd_head: 968
> dev.igb.2.queue7.rxd_tail: 885
> dev.igb.2.queue7.rx_packets: 65946491
> dev.igb.2.queue7.rx_bytes: 70263478849
> dev.igb.2.mac_stats.missed_packets: 958
> dev.igb.2.mac_stats.recv_no_buff: 69
> dev.igb.2.mac_stats.total_pkts_recvd: 498658079
> dev.igb.2.mac_stats.good_pkts_recvd: 498657121
> dev.igb.2.mac_stats.bcast_pkts_recvd: 16867
> dev.igb.2.mac_stats.mcast_pkts_recvd: 52957
> dev.igb.2.mac_stats.rx_frames_64: 59089332
> dev.igb.2.mac_stats.rx_frames_65_127: 52880118
> dev.igb.2.mac_stats.rx_frames_128_255: 7526966
> dev.igb.2.mac_stats.rx_frames_256_511: 8468389
> dev.igb.2.mac_stats.rx_frames_512_1023: 10434770
> dev.igb.2.mac_stats.rx_frames_1024_1522: 360257545
> dev.igb.2.mac_stats.good_octets_recvd: 558910494322
> dev.igb.2.mac_stats.good_octets_txd: 84618858153
> dev.igb.2.mac_stats.total_pkts_txd: 394726904
> dev.igb.2.mac_stats.good_pkts_txd: 394726904
> dev.igb.2.mac_stats.bcast_pkts_txd: 48
> dev.igb.2.mac_stats.mcast_pkts_txd: 52821
> dev.igb.2.mac_stats.tx_frames_64: 196605932
> dev.igb.2.mac_stats.tx_frames_65_127: 134602807
> dev.igb.2.mac_stats.tx_frames_128_255: 5705236
> dev.igb.2.mac_stats.tx_frames_256_511: 10267168
> dev.igb.2.mac_stats.tx_frames_512_1023: 17165496
> dev.igb.2.mac_stats.tx_frames_1024_1522: 30380265
> dev.igb.2.interrupts.asserts: 465994260
> dev.igb.2.interrupts.rx_pkt_timer: 498647546
> dev.igb.2.interrupts.tx_queue_empty: 394720657
> dev.igb.2.interrupts.tx_queue_min_thresh: 24424555
> dev.igb.2.host.rx_pkt: 9575
> dev.igb.2.host.tx_good_pkt: 6248
> dev.igb.2.host.rx_good_bytes: 558910513984
> dev.igb.2.host.tx_good_bytes: 84618858217
>
> Thanks for your help.
>
> Cheers,

You're experiencing lock contention.

Try editing igb.c and setting the queues to 1. The multiqueue
implementation in igb has a negative impact.

If you have just 1 system, get a dual-port 82571 card and use the
em driver.

BC
