Re: Vector Packet Processing (VPP) portability on FreeBSD

From: Kevin Bowling <kevin.bowling_at_kev009.com>
Date: Mon, 24 May 2021 23:34:16 -0700
One other thing I want to mention: what this means in effect is that
every queue ends up limited by EITR on ixgbe (around 30k interrupts/s
with the default settings) whether it's a TX or RX workload.  This ends
up working OK if you have sufficient CPU, but it seems awkward.  For a
TX workload we should need an order of magnitude fewer interrupts to do
10G. There was some work to adapt AIM to this new combined handler, but
it is not properly tuned and I'm not sure it should consider TX at all.
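
(Back-of-envelope: 10G at 1518-byte frames is roughly 815 kpps, so
reclaiming even half of a 1024-descriptor TX ring per interrupt -- 1024
being the 13.x default mentioned below -- would need only about 1.6k
interrupts/s, more than an order of magnitude below the ~30k/s default.)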

Regards,
Kevin

On Mon, May 24, 2021 at 11:16 PM Kevin Bowling <kevin.bowling_at_kev009.com>
wrote:

> I don't fully understand the issue, but in iflib_fast_intr_rxtx
> https://cgit.freebsd.org/src/tree/sys/net/iflib.c#n1581 it seems like
> we end up re-enabling interrupts as a matter of course, instead of
> only on spurious interrupts or below some low-water threshold (which
> seems like it would be tricky to do here).  The idea is that we want
> to pump interrupts by disabling them in the msix_que handler, and then
> hold off re-enabling them until the ift_task grouptask runs out of
> work.
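>
> As a toy sketch of the pattern I mean (plain C; names are hypothetical
> stand-ins, not the real iflib/ixgbe API -- it only models the
> mask/poll/re-arm flow):
>
> #include <stdbool.h>
> #include <stdio.h>
>
> static bool intr_enabled = true;
> static int  ring_pending = 1500;        /* packets waiting on the ring */
>
> /* MSI-X filter: mask the vector and defer the work to the task. */
> static void
> msix_que_filter(void)
> {
>         intr_enabled = false;           /* stop the interrupt stream */
>         /* the real driver would GROUPTASK_ENQUEUE() the ift_task here */
> }
>
> /* Grouptask: poll while there is work; re-arm the interrupt only
>  * once the ring is drained, not once per pass. */
> static void
> ift_task_fn(void)
> {
>         while (ring_pending > 0) {
>                 int batch = ring_pending > 256 ? 256 : ring_pending;
>                 ring_pending -= batch;  /* "process" one batch */
>         }
>         intr_enabled = true;            /* drained: re-enable */
> }
>
> int
> main(void)
> {
>         msix_que_filter();
>         ift_task_fn();
>         printf("re-enabled: %d pending: %d\n", intr_enabled, ring_pending);
>         return (0);
> }
>
> If I'm reading iflib_fast_intr_rxtx right, today the re-enable instead
> happens on every pass.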
>
> It was a lot easier to reason about this with separate TX and RX
> interrupts.  Doing the combined TXRX is definitely a win in terms of
> reducing msi-x vector usage (which is important in a lot of FreeBSD
> use cases), but it's tricky to understand.
>
> My time has been sucked away by work, so I haven't been able to look
> at this problem to the depth I'd like.  I'd be happy to discuss it
> further with anyone who is interested.
>
> Regards,
> Kevin
>
> On Tue, May 18, 2021 at 2:11 PM Vincenzo Maffione <vmaffione_at_freebsd.org>
> wrote:
> >
> > On Tue, May 18, 2021 at 09:32 Kevin Bowling
> > <kevin.bowling_at_kev009.com> wrote:
> >>
> >> On Mon, May 17, 2021 at 10:20 AM Marko Zec <zec_at_fer.hr> wrote:
> >>>
> >>> On Mon, 17 May 2021 09:53:25 +0000
> >>> Francois ten Krooden <ftk_at_Nanoteq.com> wrote:
> >>>
> >>> > On 2021/05/16 09:22, Vincenzo Maffione wrote:
> >>> >
> >>> > >
> >>> > > Hi,
> >>> > >   Yes, you are not using emulated netmap mode.
> >>> > >
> >>> > >   In the test setup depicted here
> >>> > > https://github.com/ftk-ntq/vpp/wiki/VPP-throughput-using-netmap-interfaces#test-setup
> >>> > > I think you should really try to replace VPP with the netmap
> >>> > > "bridge" application (tools/tools/netmap/bridge.c), and see what
> >>> > > numbers you get.
> >>> > >
> >>> > > You would run the application this way
> >>> > > # bridge -i ix0 -i ix1
> >>> > > and this will forward any traffic between ix0 and ix1 (in both
> >>> > > directions).
> >>> > >
> >>> > > These numbers would give you a better idea of where to look next
> >>> > > (e.g. VPP code improvements or system tuning such as NIC
> >>> > > interrupts, CPU binding, etc.).
> >>> >
> >>> > Thank you for the suggestion.
> >>> > I did run a test with the bridge this morning, and updated the
> >>> > results as well.
> >>> >
> >>> > +-------------+------------------+
> >>> > | Packet Size | Throughput (pps) |
> >>> > +-------------+------------------+
> >>> > |   64 bytes  |    7.197 Mpps    |
> >>> > |  128 bytes  |    7.638 Mpps    |
> >>> > |  512 bytes  |    2.358 Mpps    |
> >>> > | 1280 bytes  |  964.915 kpps    |
> >>> > | 1518 bytes  |  815.239 kpps    |
> >>> > +-------------+------------------+
> >>>
> >>> I assume you're on 13.0, where netmap throughput is lower than on
> >>> 11.x due to the migration of most drivers to iflib (apparently
> >>> increased overhead) and different driver defaults.  On 11.x I could
> >>> move 10G line rate from one ix to another at low CPU freqs, whereas
> >>> on 13.x the CPU must be set to max speed and still can't do 14.88
> >>> Mpps.
> >>
> >>
> >> I believe this issue is in the combined TXRX interrupt filter.  It
> >> is causing a bunch of unnecessary TX re-arms.
> >
> >
> > Could you please elaborate on that?
> >
> > TX completion is indeed the one thing that changed considerably with
> > the porting to iflib, and this could be a major contributor to the
> > performance drop.
> > My understanding is that TX interrupts are not really used anymore on
> > multi-gigabit NICs such as ix or ixl. Instead, "softirqs" are used,
> > meaning that a timer is used to perform TX completion. I don't know
> > what the motivations were for this design decision.
> > I had to decrease the timer period to 90us to ensure timely completion
> > (see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248652).
> > However, the timer period is currently not adaptive.
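> >
> > (Back-of-envelope: at 14.88 Mpps a 1024-slot TX ring wraps in about
> > 69us, so a fixed 90us completion period cannot by itself keep the
> > ring from filling at small-packet line rate.)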
> >
> >
> >>
> >>
> >>>
> >>> #1 thing which changed: the default number of descriptors per ring
> >>> dropped from 2048 (11.x) to 1024 (13.x).  Try changing this in
> >>> /boot/loader.conf:
> >>>
> >>> dev.ixl.0.iflib.override_nrxds=2048
> >>> dev.ixl.0.iflib.override_ntxds=2048
> >>> dev.ixl.1.iflib.override_nrxds=2048
> >>> dev.ixl.1.iflib.override_ntxds=2048
> >>> etc.
> >>>
> >>> For me this increases the throughput of
> >>> bridge -i netmap:ixl0 -i netmap:ixl1
> >>> from 9.3 Mpps to 11.4 Mpps
> >>>
> >>> #2: default interrupt moderation delays seem to be too long.  Combined
> >>> with increasing the ring sizes, reducing dev.ixl.0.rx_itr from 62
> >>> (default) to 40 increases the throughput further from 11.4 to 14.5 Mpps
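> >>>
> >>> For reference, the same change applied at runtime (assuming rx_itr
> >>> is writable via sysctl on your build):
> >>>
> >>> sysctl dev.ixl.0.rx_itr=40
> >>> sysctl dev.ixl.1.rx_itr=40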
> >>>
> >>> Hope this helps,
> >>>
> >>> Marko
> >>>
> >>>
> >>> > Apart from the 64-byte and 128-byte packets, the other sizes were
> >>> > matching the maximum rates possible on 10 Gbps. This was with the
> >>> > bridge application running on a single core, and that CPU core was
> >>> > maxing out at 100%.
> >>> >
> >>> > I think a bit of system tuning might be needed, but I suspect
> >>> > most of the improvement would have to come from VPP.
> >>> >
> >>> > Regards
> >>> > Francois
> >>> _______________________________________________
> >>> freebsd-net_at_freebsd.org mailing list
> >>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> >>> To unsubscribe, send any mail to "freebsd-net-unsubscribe_at_freebsd.org"
>