Poor high-PPS performance of the 10G ixgbe(9) NIC/driver in FreeBSD 10.1

Barney Cordoba barney_cordoba at yahoo.com
Wed Aug 12 01:40:17 UTC 2015


Also, using a slow-ass CPU like the Atom is completely absurd; first, no one would ever use one for this.
You have to test at under 60% CPU usage, because as you get to higher CPU usage levels the lock contention increases exponentially. You're increasing lock contention by having more queues, so more queues at higher CPU usage will perform increasingly badly as usage climbs.
You'd never run a system at 95% usage (i.e. totally hammering it) in real-world use, so why would you benchmark at such a high load? Everything changes as available CPU becomes scarce.
"What is the pps at 50% CPU usage?" is a better question to ask than the one you're asking.
BC 


     On Tuesday, August 11, 2015 9:29 PM, Barney Cordoba via freebsd-net <freebsd-net at freebsd.org> wrote:
   

 Wow, this is really important! If this is a college project, I give you a D. Maybe a D-, because it's almost useless information.
You ignore the most important aspect of "performance": efficiency is arguably the most important aspect of performance.
1M pps at 20% CPU usage is much better "performance" than 1.2M pps at 85%.
Why don't any of you understand this simple thing? Why does spreading the load equally really matter, unless you are hitting a wall with your CPUs? I don't care which CPU processes which packet. If you weren't doing moronic things like binding queues to a CPU, then you'd never have to care about distribution unless it was extremely unbalanced.
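
For what it's worth, one way to look at pps and CPU usage together on a FreeBSD box under load (a rough sketch; ix0 and the one-second interval are just illustrative choices, not part of the original report):

    # packets/sec in and out of the interface under test, sampled every second
    netstat -w 1 -I ix0

    # per-CPU and per-thread usage; the driver/interrupt threads show up here
    top -SHP

    # cumulative interrupt counts per device queue, to see how the work is spread
    vmstat -i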
BC 


    On Tuesday, August 11, 2015 7:15 PM, Olivier Cochard-Labbé <olivier at cochard.me> wrote:
  

 On Tue, Aug 11, 2015 at 11:18 PM, Maxim Sobolev <sobomax at freebsd.org> wrote:

> Hi folks,
>
Hi,


> We've been trying to migrate some of our high-PPS systems to new hardware that
> has four X540-AT2 10G NICs and observed that interrupt time goes through the
> roof after we cross around 200K PPS in and 200K out (two ports in LACP).
> The previous hardware was stable up to about 350K PPS in and 350K out. I
> believe the old one was equipped with the I350 and had an identical LACP
> configuration. The new box also has a better CPU with more cores (i.e. 24
> cores vs. 16 cores before). The CPU itself is 2 x E5-2690 v3.
>

200K PPS, and even 350K PPS, are very low values indeed.
On an Intel Xeon L5630 (only 4 cores) with one X540-AT2 (thus two 10-Gigabit
ports) I've reached about 1.8 Mpps (fastforwarding enabled) [1].
But my setup didn't use lagg(4): can you disable the lagg configuration and
re-measure your performance without lagg?
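
A minimal sketch of such a test, assuming the aggregate is named lagg0 on top
of ix0/ix1 and using a placeholder address:

    # tear down the aggregate and put an address on one physical port directly
    ifconfig lagg0 destroy
    ifconfig ix0 inet 192.0.2.10/24 up
    # (re-create the lagg from /etc/rc.conf afterwards to restore the original setup)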

Do you let the Intel NIC driver use 8 queues per port too?
In my use case (forwarding the smallest UDP packet size), I obtained better
behaviour by limiting the NIC queues to 4 (hw.ix.num_queues or
hw.ixgbe.num_queues, I don't remember which) even though my system had 8
cores. The same was true with Gigabit Intel [2] or Chelsio NICs [3].
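
For example, the cap can be set from /boot/loader.conf (a sketch; whether the
tunable is spelled hw.ix.num_queues or hw.ixgbe.num_queues depends on the
driver version, so check which one your kernel exposes):

    # /boot/loader.conf -- limit each ix port to 4 queues (takes effect at boot)
    hw.ix.num_queues=4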

Don't forget to disable TSO and LRO too.
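
For instance, on each 10-Gigabit port (assuming they are named ix0 and ix1
here):

    # turn off TCP segmentation offload and large receive offload for the test
    ifconfig ix0 -tso -lro
    ifconfig ix1 -tso -lro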

Regards,

Olivier

[1]
http://bsdrp.net/documentation/examples/forwarding_performance_lab_of_an_ibm_system_x3550_m3_with_10-gigabit_intel_x540-at2#graphs
[2]
http://bsdrp.net/documentation/examples/forwarding_performance_lab_of_a_superserver_5018a-ftn4#graph1
[3]
http://bsdrp.net/documentation/examples/forwarding_performance_lab_of_a_hp_proliant_dl360p_gen8_with_10-gigabit_with_10-gigabit_chelsio_t540-cr#reducing_nic_queues
_______________________________________________
freebsd-net at freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"

  