bad throughput performance on multiple systems: Re: Fwd: Re: Disappointing packets-per-second performance results on a Dell,PE R530

John Jasen jjasen at gmail.com
Thu Mar 16 19:50:44 UTC 2017


As a few points of note, partial resolution, and curiosity:

Following up on leads that 11-STABLE had tryforward improvements over
11-RELENG, I upgraded. In the same tests (24 client streams over UDP with
small packets), the system went from passing 1.7m pps to about 2.5m.

Following indications from Navdeep Parhar that UDP queue hashing is not as
efficient as it could be, we started running the tests with power-of-2
stream counts (2, 4, 8, 16, 32) and were able to push the system up to
5m pps.
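
To illustrate why the hash inputs matter for queue distribution, here is a
minimal sketch. It is not the adapter's actual hash: CRC32 stands in for the
real RSS function, and the addresses and ports are made up. It only shows
that hashing on addresses alone (2-tuple) puts every stream between one pair
of hosts on the same rx queue, while including ports (4-tuple) gives the
streams a chance to spread out.

```python
import zlib
from collections import Counter

RX_QUEUES = 8  # rxq count reported by dmesg in this thread


def queue_for(fields, n_queues=RX_QUEUES):
    """Map a tuple of header fields to a queue index.

    CRC32 is a stand-in chosen for determinism; it is NOT the hash
    the NIC or driver really uses.
    """
    key = "|".join(str(f) for f in fields).encode()
    return zlib.crc32(key) % n_queues


def spread(flows, use_ports):
    """Count how many flows land on each queue."""
    counts = Counter()
    for src_ip, dst_ip, src_port, dst_port in flows:
        if use_ports:
            fields = (src_ip, dst_ip, src_port, dst_port)  # 4-tuple
        else:
            fields = (src_ip, dst_ip)                      # 2-tuple
        counts[queue_for(fields)] += 1
    return counts


# Hypothetical setup: 24 UDP streams from one sender to one receiver,
# differing only in source port.
flows = [("10.0.0.1", "10.0.1.1", 20000 + i, 5001) for i in range(24)]

two_tuple = spread(flows, use_ports=False)
four_tuple = spread(flows, use_ports=True)

print("2-tuple: queues used =", len(two_tuple), dict(two_tuple))
print("4-tuple: queues used =", len(four_tuple), dict(four_tuple))
```

With a single address pair the 2-tuple case always uses exactly one queue;
how evenly the 4-tuple case spreads depends on the hash and the port values.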

In the tests we are currently seeing approximately 10-11m pps on the
outside interface, with around 5-6m dropped and about 5m passed.


On Mon, Mar 13, 2017 at 1:31 PM, Navdeep Parhar <nparhar at gmail.com> wrote:

> On Mon, Mar 13, 2017 at 10:13 AM, John Jasen <jjasen at gmail.com> wrote:
> > On 03/13/2017 01:03 PM, Navdeep Parhar wrote:
> >
> >> On Sun, Mar 12, 2017 at 5:35 PM, John Jasen <jjasen at gmail.com> wrote:
> >>> UDP traffic. dmesg reports 16 txq, 8 rxq -- which is the default for
> >>> Chelsio.
> >>>
> >> I don't recall offhand, but UDP might be using 2-tuple hashing by
> >> default and that might affect the distribution of flows across queues.
> >> Are there senders generating IP fragments by any chance (that'll
> >> depend on the "send size" that your UDP application is using)?
> >
> > No, they're not fragmenting.
> >
> >> Have you tried limiting the adapter's rx ithreads to the CPU that the
> >> PCIe slot with the adapter is wired to?
> >
> > Above and beyond the use of cpuset, you mean?
>
> I meant cpuset.
>
> If possible, try your experiments on a single socket system.
>
> Regards,
> Navdeep
>

