Poor high-PPS performance of the 10G ixgbe(9) NIC/driver in FreeBSD 10.1

Maxim Sobolev sobomax at FreeBSD.org
Wed Aug 12 01:46:46 UTC 2015


Thanks, Barney, for the totally useless response and attempted insult! And
yes, we are hitting the CPU limit on 12-core E5-2620 v2 systems running the
I350, so yes, we do know a little bit about how to distribute our
application, at least with igb(4). For some reason this does not work with
ixgbe(4), and we are trying to understand why that is.

http://sobomax.sippysoft.com/ScreenShot388.png
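
(For the record, here is roughly how we look at the spread on our boxes;
the device and queue names are from our setup and will differ elsewhere:)

    # per-queue interrupt counters and per-CPU load
    vmstat -i | egrep 'igb|ix'
    top -P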

On Tue, Aug 11, 2015 at 6:28 PM, Barney Cordoba <barney_cordoba at yahoo.com>
wrote:

> Wow, this is really important! If this were a college project, I'd give you
> a D. Maybe a D-, because it's almost useless information.
>
> You ignore the most important aspect of "performance". Efficiency is
> arguably the most important aspect of performance.
>
> 1M pps at 20% CPU usage is much better "performance" than 1.2M pps at 85%.
>
> Why don't any of you understand this simple thing? Why does spreading the
> load equally really matter, unless you are hitting a wall with your CPUs? I
> don't care which CPU processes which packet. If you weren't doing moronic
> things like binding to a CPU, then you'd never have to care about
> distribution unless it was extremely unbalanced.
>
> BC
>
>
>
> On Tuesday, August 11, 2015 7:15 PM, Olivier Cochard-Labbé <
> olivier at cochard.me> wrote:
>
>
> On Tue, Aug 11, 2015 at 11:18 PM, Maxim Sobolev <sobomax at freebsd.org>
> wrote:
>
> > Hi folks,
> >
>
> Hi,
>
> > We've been trying to migrate some of our high-PPS systems to new hardware
> > that has four X540-AT2 10G NICs, and observed that interrupt time goes
> > through the roof after we cross around 200K PPS in and 200K out (two ports
> > in LACP). The previous hardware was stable up to about 350K PPS in and
> > 350K out. I believe the old one was equipped with the I350 and had an
> > identical LACP configuration. The new box also has a better CPU with more
> > cores (i.e. 24 cores vs. 16 before). The CPU itself is 2 x E5-2690 v3.
> >
>
> 200K PPS, and even 350K PPS, are very low values indeed.
> On an Intel Xeon L5630 (4 cores only) with one X540-AT2
> (i.e. two 10-Gigabit ports) I've reached about 1.8Mpps (fast-forwarding
> enabled) [1].
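>
> (That is with the IP fast-forwarding path turned on; for reference, the
> sysctl involved on FreeBSD 10.x:)
>
>     # /etc/sysctl.conf -- enable the IP fast-forwarding path
>     net.inet.ip.fastforwarding=1
>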
> But my setup didn't use lagg(4): can you disable the lagg configuration and
> re-measure your performance without it?
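>
> (Something along these lines, assuming ix0 is one of the LACP member ports
> and using a placeholder address -- an untested sketch, not a recipe:)
>
>     # tear down the lagg and address one port directly for the test
>     ifconfig lagg0 destroy
>     ifconfig ix0 inet 192.0.2.10/24 up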
>
> Do you let the Intel NIC driver use 8 queues per port too?
> In my use case (forwarding minimum-size UDP packets), I obtained better
> behaviour by limiting the NIC to 4 queues (hw.ix.num_queues or
> hw.ixgbe.num_queues, I don't remember which) when my system had 8 cores,
> and this held with Gigabit Intel [2] and Chelsio [3] NICs as well.
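>
> (A minimal loader.conf sketch -- I am not sure which spelling of the
> tunable your driver version uses, so check both:)
>
>     # /boot/loader.conf -- cap the number of RX/TX queue pairs per port
>     hw.ix.num_queues=4      # newer ixgbe(4) spelling
>     #hw.ixgbe.num_queues=4  # older spelling, if the above has no effect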
>
> Don't forget to disable TSO and LRO too.
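>
> (For example, on each physical port -- the interface names are just
> placeholders:)
>
>     ifconfig ix0 -tso -lro
>     ifconfig ix1 -tso -lro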
>
> Regards,
>
> Olivier
>
> [1]
>
> http://bsdrp.net/documentation/examples/forwarding_performance_lab_of_an_ibm_system_x3550_m3_with_10-gigabit_intel_x540-at2#graphs
> [2]
>
> http://bsdrp.net/documentation/examples/forwarding_performance_lab_of_a_superserver_5018a-ftn4#graph1
> [3]
>
> http://bsdrp.net/documentation/examples/forwarding_performance_lab_of_a_hp_proliant_dl360p_gen8_with_10-gigabit_with_10-gigabit_chelsio_t540-cr#reducing_nic_queues
>