Poor high-PPS performance of the 10G ixgbe(9) NIC/driver in FreeBSD 10.1

Luigi Rizzo rizzo at iet.unipi.it
Wed Aug 12 12:23:25 UTC 2015


As I was telling Maxim, you should disable AIM (adaptive interrupt
moderation), because it only matches the max interrupt rate to the average
packet size, which is the last thing you want here.

Setting the interrupt rate with sysctl (one knob per queue) gives you
precise control over the max rate (and hence over the extra latency). 20k
interrupts/s gives you at most 50us of latency, and the 2k slots in the
queue are still enough to absorb a burst of min-sized frames hitting a
single queue (the OS will start dropping long before that level, but
that's another story).
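That tuning looks roughly like this (a sketch only: the unit/queue counts
and the "hw.ix.enable_aim" tunable name are assumptions based on this era's
ixgbe(4); verify the exact OIDs on your build with "sysctl -a | grep ix"):

```shell
# Disable adaptive interrupt moderation at boot (loader tunable,
# name assumed -- check ixgbe(4) for your driver version).
echo 'hw.ix.enable_aim=0' >> /boot/loader.conf

# Cap every queue of ix0/ix1 at a fixed 20000 interrupts/s, which
# bounds the moderation delay to 1/20000 s = 50 us per interrupt.
for dev in 0 1; do
  for q in 0 1 2 3 4 5 6 7; do
    sysctl dev.ix.${dev}.queue${q}.interrupt_rate=20000
  done
done
```

After a reboot, re-read the per-queue interrupt_rate sysctls under load to
confirm the rates stay pinned instead of bouncing around as in the output
quoted below.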

Cheers
Luigi

On Wednesday, August 12, 2015, Babak Farrokhi <farrokhi at freebsd.org> wrote:

> I ran into the same problem with almost the same hardware (Intel X520)
> on 10-STABLE. HT/SMT is disabled and cards are configured with 8 queues,
> with the same sysctl tunings as sobomax@ did. I am not using lagg, no
> FLOWTABLE.
>
> I experimented with pmcstat (RESOURCE_STALLS) a while ago and here [1]
> [2] you can see the results, including pmc output, callchain, flamegraph
> and gprof output.
>
> I am experiencing a huge number of interrupts with a 200 kpps load:
>
> # sysctl dev.ix | grep interrupt_rate
> dev.ix.1.queue7.interrupt_rate: 125000
> dev.ix.1.queue6.interrupt_rate: 6329
> dev.ix.1.queue5.interrupt_rate: 500000
> dev.ix.1.queue4.interrupt_rate: 100000
> dev.ix.1.queue3.interrupt_rate: 50000
> dev.ix.1.queue2.interrupt_rate: 500000
> dev.ix.1.queue1.interrupt_rate: 500000
> dev.ix.1.queue0.interrupt_rate: 100000
> dev.ix.0.queue7.interrupt_rate: 500000
> dev.ix.0.queue6.interrupt_rate: 6097
> dev.ix.0.queue5.interrupt_rate: 10204
> dev.ix.0.queue4.interrupt_rate: 5208
> dev.ix.0.queue3.interrupt_rate: 5208
> dev.ix.0.queue2.interrupt_rate: 71428
> dev.ix.0.queue1.interrupt_rate: 5494
> dev.ix.0.queue0.interrupt_rate: 6250
>
> [1] http://farrokhi.net/~farrokhi/pmc/6/
> [2] http://farrokhi.net/~farrokhi/pmc/7/
>
> Regards,
> Babak
>
>
> Alexander V. Chernikov wrote:
> > 12.08.2015, 02:28, "Maxim Sobolev" <sobomax at FreeBSD.org>:
> >> Olivier, keep in mind that we are not "kernel forwarding" packets, but
> >> "app forwarding", i.e. the packet goes the full way
> >> net->kernel->recvfrom->app->sendto->kernel->net, which is why we have
> >> much lower PPS limits and why I think we are actually benefiting from
> >> the extra queues. A single-threaded sendto() in a loop is CPU-bound at
> >> about 220K PPS, and while running the test I am observing that outbound
> >> traffic from one thread is mapped onto a specific queue (well, a pair
> >> of queues on two separate adapters, due to lagg load balancing). And
> >> the peak performance of that test is at 7 threads, which I believe
> >> corresponds to the number of queues. We have plenty of CPU cores in the
> >> box (24) with HTT/SMT disabled, and each queue is mapped to a specific
> >> CPU. This leaves us with at least 8 CPUs fully capable of running our
> >> app. If you look at the CPU utilization, we are at about 10% when the
> >> issue hits.
> >
> > In any case, it would be great if you could provide some profiling
> > info, since there could be plenty of problematic places, starting from
> > TX ring contention to some locks inside UDP, or even the (in)famous
> > random entropy harvester...
> > e.g. something like pmcstat -TS instructions -w1 might be sufficient to
> > determine the reason.
> >> ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.5.15>
> port
> >> 0x6020-0x603f mem 0xc7c00000-0xc7dfffff,0xc7e04000-0xc7e07fff irq 40 at
> >> device 0.0 on pci3
> >> ix0: Using MSIX interrupts with 9 vectors
> >> ix0: Bound queue 0 to cpu 0
> >> ix0: Bound queue 1 to cpu 1
> >> ix0: Bound queue 2 to cpu 2
> >> ix0: Bound queue 3 to cpu 3
> >> ix0: Bound queue 4 to cpu 4
> >> ix0: Bound queue 5 to cpu 5
> >> ix0: Bound queue 6 to cpu 6
> >> ix0: Bound queue 7 to cpu 7
> >> ix0: Ethernet address: 0c:c4:7a:5e:be:64
> >> ix0: PCI Express Bus: Speed 5.0GT/s Width x8
> >> 001.000008 [2705] netmap_attach success for ix0 tx 8/4096 rx
> >> 8/4096 queues/slots
> >> ix1: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.5.15>
> port
> >> 0x6000-0x601f mem 0xc7a00000-0xc7bfffff,0xc7e00000-0xc7e03fff irq 44 at
> >> device 0.1 on pci3
> >> ix1: Using MSIX interrupts with 9 vectors
> >> ix1: Bound queue 0 to cpu 8
> >> ix1: Bound queue 1 to cpu 9
> >> ix1: Bound queue 2 to cpu 10
> >> ix1: Bound queue 3 to cpu 11
> >> ix1: Bound queue 4 to cpu 12
> >> ix1: Bound queue 5 to cpu 13
> >> ix1: Bound queue 6 to cpu 14
> >> ix1: Bound queue 7 to cpu 15
> >> ix1: Ethernet address: 0c:c4:7a:5e:be:65
> >> ix1: PCI Express Bus: Speed 5.0GT/s Width x8
> >> 001.000009 [2705] netmap_attach success for ix1 tx 8/4096 rx
> >> 8/4096 queues/slots
> >>
> >> On Tue, Aug 11, 2015 at 4:14 PM, Olivier Cochard-Labbé <
> olivier at cochard.me>
> >> wrote:
> >>
> >>>  On Tue, Aug 11, 2015 at 11:18 PM, Maxim Sobolev <sobomax at freebsd.org>
> >>>  wrote:
> >>>
> >>>>  Hi folks,
> >>>>
> >>>  Hi,
> >>>
> >>>>  We've been trying to migrate some of our high-PPS systems to new
> >>>>  hardware that has four X540-AT2 10G NICs, and observed that
> >>>>  interrupt time goes through the roof after we cross around 200K PPS
> >>>>  in and 200K out (two ports in LACP). The previous hardware was
> >>>>  stable up to about 350K PPS in and 350K out. I believe the old one
> >>>>  was equipped with the I350 and had an identical LACP configuration.
> >>>>  The new box also has a better CPU with more cores (i.e. 24 cores
> >>>>  vs. 16 cores before). The CPU itself is 2 x E5-2690 v3.
> >>>  200K PPS, and even 350K PPS, are very low values indeed.
> >>>  On an Intel Xeon L5630 (only 4 cores) with one X540-AT2
> >>>  (i.e. two 10-Gigabit ports) I've reached about 1.8 Mpps
> >>>  (fastforwarding enabled) [1].
> >>>  But my setup didn't use lagg(4): can you disable the lagg
> >>>  configuration and re-measure your performance without lagg?
> >>>
> >>>  Do you let the Intel NIC driver use 8 queues per port too?
> >>>  In my use case (forwarding the smallest UDP packet size), I obtained
> >>>  better behaviour by limiting the NIC to 4 queues (hw.ix.num_queues or
> >>>  hw.ixgbe.num_queues, I don't remember which) even though my system
> >>>  had 8 cores. And this with Gigabit Intel [2] or Chelsio [3] NICs.
> >>>
> >>>  Don't forget to disable TSO and LRO too.
> >>>
> >>>  Regards,
> >>>
> >>>  Olivier
> >>>
> >>>  [1]
> >>>
> http://bsdrp.net/documentation/examples/forwarding_performance_lab_of_an_ibm_system_x3550_m3_with_10-gigabit_intel_x540-at2#graphs
> >>>  [2]
> >>>
> http://bsdrp.net/documentation/examples/forwarding_performance_lab_of_a_superserver_5018a-ftn4#graph1
> >>>  [3]
> >>>
> http://bsdrp.net/documentation/examples/forwarding_performance_lab_of_a_hp_proliant_dl360p_gen8_with_10-gigabit_with_10-gigabit_chelsio_t540-cr#reducing_nic_queues
> >> _______________________________________________
> >> freebsd-net at freebsd.org mailing list
> >> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> >> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"



-- 
-----------------------------------------+-------------------------------
 Prof. Luigi RIZZO, rizzo at iet.unipi.it  . Dip. di Ing. dell'Informazione
 http://www.iet.unipi.it/~luigi/        . Universita` di Pisa
 TEL      +39-050-2217533               . via Diotisalvi 2
 Mobile   +39-338-6809875               . 56122 PISA (Italy)
-----------------------------------------+-------------------------------
