bad throughput performance on multiple systems: Re: Fwd: Re: Disappointing packets-per-second performance results on a Dell,PE R530

Slawa Olhovchenkov slw at zxy.spb.ru
Fri Mar 17 10:35:40 UTC 2017


On Thu, Mar 16, 2017 at 03:50:42PM -0400, John Jasen wrote:

> As a few points of note, partial resolution, and curiosity:
> 
> Following up on leads that 11-STABLE had tryforward improvements over
> 11-RELENG, I upgraded. In the same tests (24 client streams over UDP with
> small packets), the system went from passing 1.7m pps to about 2.5m.
> 
> Following indications from Navdeep Parhar that UDP queue hashing is not as
> efficient as it could be, we started running the tests with stream counts
> that are powers of 2 (2, 4, 8, 16, 32) -- and were able to push the system
> up to 5m pps.
> 
> In the tests we are currently seeing approximately 10-11m pps on the
> outside interface, with around 5-6m dropped and 5m passed.

You want more?
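
For anyone following the queue-hashing point raised in this thread, here is a
rough sketch of why the hash input matters for UDP. It is only an
illustration: CRC32 stands in for the adapter's real receive hash, the
addresses and ports are made up, and the only number taken from the thread is
the 8 rx queues. With a 2-tuple key (source and destination IP only), every
stream between the same pair of hosts lands on one queue; adding the ports
spreads the streams out.

```python
import socket
import struct
import zlib

NUM_RX_QUEUES = 8  # "8 rxq" as reported in the thread


def two_tuple(flow):
    """Hash key built from source and destination IP only."""
    src, dst, _sport, _dport = flow
    return socket.inet_aton(src) + socket.inet_aton(dst)


def four_tuple(flow):
    """Hash key built from IPs plus source and destination ports."""
    _src, _dst, sport, dport = flow
    return two_tuple(flow) + struct.pack("!HH", sport, dport)


def queues_used(flows, key_fn, num_queues=NUM_RX_QUEUES):
    """Set of queue indices hit by the flows.

    zlib.crc32 is a stand-in hash chosen for illustration; it is NOT the
    hash the NIC actually uses.
    """
    return {zlib.crc32(key_fn(f)) % num_queues for f in flows}


# Hypothetical test setup: 24 UDP streams between one client and one
# server, differing only in destination port.
flows = [("10.0.0.1", "10.0.1.1", 40000, 5001 + i) for i in range(24)]

print("2-tuple queues used:", len(queues_used(flows, two_tuple)))  # always 1
print("4-tuple queues used:", len(queues_used(flows, four_tuple)))
```

If the streams in a real test come from many distinct client addresses, a
2-tuple hash will spread them further than this single-pair sketch suggests.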

> On Mon, Mar 13, 2017 at 1:31 PM, Navdeep Parhar <nparhar at gmail.com> wrote:
> 
> > On Mon, Mar 13, 2017 at 10:13 AM, John Jasen <jjasen at gmail.com> wrote:
> > > On 03/13/2017 01:03 PM, Navdeep Parhar wrote:
> > >
> > >> On Sun, Mar 12, 2017 at 5:35 PM, John Jasen <jjasen at gmail.com> wrote:
> > >>> UDP traffic. dmesg reports 16 txq, 8 rxq -- which is the default for
> > >>> Chelsio.
> > >>>
> > >> I don't recall offhand, but UDP might be using 2-tuple hashing by
> > >> default and that might affect the distribution of flows across queues.
> > >> Are there senders generating IP fragments by any chance (that'll
> > >> depend on the "send size" that your UDP application is using)?
> > >
> > > No, they're not fragmenting.
> > >
> > >> Have you tried limiting the adapter's rx ithreads to the CPU that the
> > >> PCIe slot with the adapter is wired to?
> > >
> > > Above and beyond the use of cpuset, you mean?
> >
> > I meant cpuset.
> >
> > If possible, try your experiments on a single socket system.
> >
> > Regards,
> > Navdeep
> >
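
As a sketch of the cpuset suggestion above: the helper below only builds
cpuset(1) command lines that bind interrupt threads to a chosen set of CPUs,
round-robin. The IRQ numbers and CPU list are hypothetical placeholders; the
real IRQs would come from something like vmstat -i, and which CPUs are local
to the adapter's PCIe slot depends on the machine. Nothing is executed here.

```python
from itertools import cycle


def cpuset_commands(irqs, cpus):
    """Build one `cpuset -l <cpu> -x <irq>` command per IRQ.

    IRQs are assigned to the given CPUs round-robin. The commands are
    returned as argument lists and are not run.
    """
    return [
        ["cpuset", "-l", str(cpu), "-x", str(irq)]
        for irq, cpu in zip(irqs, cycle(cpus))
    ]


# Hypothetical values: 8 rx-queue IRQs and 4 CPUs assumed to sit on the
# socket that the adapter's PCIe slot is wired to.
rx_irqs = range(264, 272)
local_cpus = [0, 2, 4, 6]

for cmd in cpuset_commands(rx_irqs, local_cpus):
    print(" ".join(cmd))
```

Printing the commands first makes it easy to check the mapping before
applying anything as root.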

