Re: git: 1a72c3d76aea - stable/13 - e1000: always enable PCSD when RSS hashing [Was: TCP6 regression for MTU path on stable/13]

From: Kevin Bowling <kevin.bowling_at_kev009.com>
Date: Sun, 26 Sep 2021 02:59:23 UTC

On Sat, Sep 25, 2021 at 5:53 PM Harry Schmalzbauer <freebsd@omnilan.de> wrote:
>
> On 13.09.2021 at 13:18, Harry Schmalzbauer wrote:
> > On 13.09.2021 at 12:37, Andrey V. Elsukov wrote:
> >> On 12.09.2021 at 14:12, Harry Schmalzbauer wrote:
> >>> I will try to track it down further, but in case anybody has an idea
> >>> which change during the last few months in stable/13 could have caused
> >>> this real-world problem with TCP6 throughput, I'm happy to
> >>> start testing at that point.
> >>
> >> Hi,
> >>
> >> Take a look at:
> >>
> >>    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=255749
> >>    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248005
> >>
> >> Is the problem described in these PRs the same as yours?
> >
> > Hi, thank you very much for your attention!
> > Most likely these are unrelated to the regression I'm suffering from,
> > because these affect 13-release and earlier.
> > Mine arose during the last months.
> > And it seems not to be a jumbo frame problem.
> :
> > Hope to get back to you soon with more info.
>
>
> Since the setup was hard to replicate, it took some time.
> Here's the commit causing the heavy IPv6 performance drop with Intel
> Powerville (i350):
>
> > The branch stable/13 has been updated by kbowling (ports committer):
> >
> > URL:
> > https://cgit.FreeBSD.org/src/commit/?id=1a72c3d76aeafe4422ff20f81c4142efb983b7d7
> >
> > commit 1a72c3d76aeafe4422ff20f81c4142efb983b7d7
> > Author:     Kevin Bowling <kbowling@FreeBSD.org>
> > AuthorDate: 2021-08-16 17:17:34 +0000
> > Commit:     Kevin Bowling <kbowling@FreeBSD.org>
> > CommitDate: 2021-08-23 16:23:43 +0000
> >
> >     e1000: always enable PCSD when RSS hashing
> >
> >     To enable RSS hashing in the NIC, the PCSD bit must be set.
> >
> >     By default, this is never set when RXCSUM is disabled - which
> >     causes problems higher up in the stack.
> >
> >     While here improve the RXCSUM flag assignments when enabling or
> >     disabling IFCAP_RXCSUM.
> >
> >     See also:
> > https://lists.freebsd.org/pipermail/freebsd-current/2020-May/076148.html
> >
> >     Reviewed by:    markj, Franco Fichtner <franco@opnsense.org>,
> >                     Stephan de Wit <stephan.dewt@yahoo.co.uk>
> >     Obtained from:  OPNsense
> >     MFC after:      1 week
> >     Differential Revision:  https://reviews.freebsd.org/D31501
> >     Co-authored-by: Stephan de Wit <stephan.dewt@yahoo.co.uk>
> >     Co-authored-by: Franco Fichtner <franco@opnsense.org>
> >
> >     (cherry picked from commit 69e8e8ea3d4be9da6b5bc904a444b51958128ff5)
> > :
>
> Noticed and successfully cross-checked (on both a8446d412 and f72cdea25)
> with i350 Powerville, device=0x1521.
> *Reverting git: 1a72c3d76aea against today's stable/13 (f72cdea25-dirty)
> solves the issue, which seems to be IPv6 related only.*
> (The a8446d412 kernel from 2021-09-25 shows the issue; reverting this
> commit solves it with that older kernel too.)
>
> In a very brief check, IPv4 on identical paths seems to be unaffected,
> but I can't guarantee that, since v4 isn't in use where I first noticed
> (and suffer from) the problem, and I did just one comparison in order to
> narrow it down (asymmetric FIB setup regarding inet and inet6).
>
> What made this complicated: ng_bridge(4), mpd5/pppoe, ppt, and bhyve are
> involved as well (and vlan(4), lagg(4), vtnet(4), etc.), but it seems
> to be just an e1000 driver issue.
> There were many changes/improvements/cleanups between July and September,
> but I tracked this commit down as the root cause of my IPv6 issue
> (performance dropping from 33MB/s to <=0.3MB/s).
>
>
> That being said, it was hard to find the time to replicate the setup,
> and I have nothing to offer as a solution.  I haven't semantically checked
> anything yet and didn't do any tests besides my single IPv6 performance
> test.  Contrary to my first suspicion, at least in my clone lab, it
> isn't MTU/jumbo frame related, just a plain e1000/i350 IPv6 regression.
>
>
> Happy to test anything; I can test-drive swiftly, but without further
> diagnostics, during work days.
>
> Thanks,
> -harry

Thanks for the report.  I added Franco and Stephan to the Cc for visibility.

Nothing is immediately jumping out at me.  In a private email, Harry
tried not setting 'rxcsum |= E1000_RXCSUM_TUOFL | E1000_RXCSUM_IPOFL',
which was an intentional behavior change, but that did not improve this
IPv6 use case.
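
For anyone following along, the flag logic under discussion can be
sketched roughly as below.  This is a standalone illustration under
stated assumptions, not the driver code: the helper name sketch_rxcsum()
and its two boolean parameters are hypothetical, and the bit values are
assumed to match e1000_defines.h and should be verified against the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values assumed to match e1000_defines.h; verify against the tree. */
#define E1000_RXCSUM_IPOFL 0x00000100u /* IPv4 checksum offload */
#define E1000_RXCSUM_TUOFL 0x00000200u /* TCP/UDP checksum offload */
#define E1000_RXCSUM_PCSD  0x00002000u /* packet checksum disabled */

/*
 * Hypothetical helper: derive a new RXCSUM register value from the
 * current one.  rxcsum_cap stands in for "IFCAP_RXCSUM is enabled",
 * rss_hashing for "RSS hashing is in use".
 */
static uint32_t
sketch_rxcsum(uint32_t rxcsum, bool rxcsum_cap, bool rss_hashing)
{
	/* Set or clear the offload bits together with the capability. */
	if (rxcsum_cap)
		rxcsum |= E1000_RXCSUM_TUOFL | E1000_RXCSUM_IPOFL;
	else
		rxcsum &= ~(E1000_RXCSUM_TUOFL | E1000_RXCSUM_IPOFL);

	/* PCSD is set whenever RSS hashing is on, regardless of RXCSUM. */
	if (rss_hashing)
		rxcsum |= E1000_RXCSUM_PCSD;

	return (rxcsum);
}
```

The point of the sketch is only that PCSD no longer depends on the
RXCSUM capability, which is what the commit message describes; Harry's
experiment corresponds to skipping the first '|=' line.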

I will need to do some document reading.

Regards,
Kevin