ixgbe: Network performance tuning (#TCP connections)

Meyer, Wolfgang wolfgang.meyer at hob.de
Fri Feb 5 18:05:46 UTC 2016



> -----Original Message-----
> From: owner-freebsd-performance at freebsd.org [mailto:owner-freebsd-
> performance at freebsd.org] On Behalf Of K. Macy
> Sent: Mittwoch, 3. Februar 2016 20:31
> To: Allan Jude
> Cc: freebsd-performance at freebsd.org
> Subject: Re: ixgbe: Network performance tuning (#TCP connections)
>
> Also - check for txq overruns and rxq drops in sysctl. 64 is very low on
> FreeBSD. You may also look into increasing the size of your pcb hash table.
>
>
>
> On Wed, Feb 3, 2016 at 9:50 AM, Allan Jude <allanjude at freebsd.org> wrote:
> > On 2016-02-03 08:37, Meyer, Wolfgang wrote:
> >> Hello,
> >>
> >> we are evaluating network performance on a Dell server (PowerEdge R930
> >> with 4 sockets, hw.model: Intel(R) Xeon(R) CPU E7-8891 v3 @ 2.80GHz)
> >> with 10 GbE cards. We use programs in which the server side accepts
> >> connections on an IP address and port from the client side, and after
> >> the connection is established data is sent in turns between server and
> >> client in a predefined pattern (the server side sends more data than the
> >> client side), with sleeps between the send phases. The test set-up is
> >> chosen in such a way that every client process initiates 500 connections
> >> handled in threads, and on the server side each process, representing
> >> one IP/port pair, also handles 500 connections in threads.
> >>
> >> The number of connections is then increased and the overall network
> >> throughput is observed using nload. With FreeBSD on the server side,
> >> errors begin to occur at roughly 50,000 connections and the overall
> >> throughput does not increase further with more connections. With Linux
> >> on the server side it is possible to establish more than 120,000
> >> connections, and at 50,000 connections the overall throughput is double
> >> that of FreeBSD with the same sending pattern. Furthermore, system load
> >> on FreeBSD is much higher, with 50 % system usage on each core and 80 %
> >> interrupt usage on the 8 cores handling the interrupt queues for the
> >> NIC. In comparison, for 50,000 connections Linux shows <10 % system
> >> usage, <10 % user usage and about 15 % interrupt usage on the 16 cores
> >> handling the network interrupts.
> >>
> >> Varying the number of NIC interrupt queues does not change the
> >> performance (it rather worsens the situation). Disabling Hyper-Threading
> >> (utilising 40 cores) degrades the performance. Increasing MAXCPU to
> >> utilise all 80 cores brings no improvement over 64 cores; atkbd and uart
> >> had to be disabled to avoid kernel panics with the increased MAXCPU
> >> (thanks to Andre Oppermann for investigating this). Initially the tests
> >> were made on 10.2-RELEASE; later I switched to 10-STABLE (later still
> >> with ixgbe driver version 3.1.0), but that didn't change the numbers.
> >>
> >> Some sysctl configurables were modified following the network
> >> performance tuning guidelines found on the net (e.g.
> >> https://calomel.org/freebsd_network_tuning.html,
> >> https://www.freebsd.org/doc/handbook/configtuning-kernel-limits.html,
> >> https://pleiades.ucsc.edu/hyades/FreeBSD_Network_Tuning), but most of
> >> them didn't have any measurable impact. See below for the final
> >> sysctl.conf and loader.conf settings. Actually, the only tunables that
> >> provided any improvement were hw.ix.txd and hw.ix.rxd, which were
> >> reduced (!) to the minimum value of 64, and hw.ix.tx_process_limit and
> >> hw.ix.rx_process_limit, which were set to -1.
> >>
> >> Any ideas which tunables might be changed to support a higher number of
> >> TCP connections (it's not a question of overall throughput, as changing
> >> the sending pattern allows me to fully utilise the 10 Gb bandwidth)? How
> >> can I determine where the kernel is spending the time that causes the
> >> high CPU load? Any pointers are highly appreciated; I can't believe that
> >> there is such a blatant difference in network performance compared to
> >> Linux.
> >>
> >> Regards,
> >> Wolfgang
> >>
> >
> > I wonder if this might be NUMA related. Specifically, it might help to
> > make sure that the 8 CPU cores the NIC queues are pinned to are on the
> > same CPU that is backing the PCI-E slot the NIC is in.
> >
> >
> > --
> > Allan Jude
> >
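
Regarding the NUMA suggestion: that is something I still have to look into.
If pinning the queue interrupts by hand turns out to be necessary, I assume it
would look roughly like this (the IRQ number and the core are made-up
placeholder values; the real vectors show up in vmstat -i as ix0:que N):

  # list the MSI-X vectors of the ix queues
  vmstat -i | grep ix0
  # bind one queue interrupt to one core on the package local to the NIC
  # (irq 264 and core 0 are example values only)
  cpuset -l 0 -x 264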


As I said in my original message, the rxd and txd values were more or less the only ones that improved my numbers, and only when I reduced them. I never understood that behaviour, and a double-check now shows that I stand corrected on that observation: raising the values (to 1024) did not degrade throughput back to my original bad numbers; on the contrary, it slightly improved it (though only barely measurably, within the measurement variation). I don't know what cross-interaction led to my original observation.
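
For reference, these are the ring and process-limit tunables in question as I
have them in /boot/loader.conf (the values are the ones from my tests, not a
recommendation):

  # descriptor ring sizes (64 vs. 1024 tested)
  hw.ix.txd="64"
  hw.ix.rxd="64"
  # -1 = no per-pass packet processing limit
  hw.ix.tx_process_limit="-1"
  hw.ix.rx_process_limit="-1"

To see whether the rings actually overflow I will watch the drop counters the
driver exposes under dev.ix.<unit> (something like
sysctl dev.ix.0 | grep -Ei 'drop|no_desc'; I am not sure of the exact counter
names in driver version 3.1.0), as suggested.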

Concerning the pcb hash table size, I only found net.inet.sctp.pcbhashsize, and that had no influence. I'm not sure whether SCTP plays a role at all in my problem.
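
If the pcb hash suggestion referred to the TCP side, I assume the relevant
knob is the loader tunable net.inet.tcp.tcbhashsize (read-only at run time, so
it would have to go into /boot/loader.conf); I have not tried it yet.
Something like:

  # size of the TCP pcb (connection) hash table; example value, untested here
  net.inet.tcp.tcbhashsize="65536"

With 50,000+ connections a larger hash table should at least shorten the pcb
lookups done for every incoming segment.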

Regards,
Wolfgang Meyer


