Re: git: 9e6d11ce9a51 - main - vtnet: Adjust rx buffer so IP header 32-bit aligned

From: Warner Losh <imp_at_bsdimp.com>
Date: Thu, 21 Dec 2023 16:53:05 UTC
On Thu, Dec 21, 2023, 8:54 AM Mark Johnston <markj@freebsd.org> wrote:

> On Thu, Dec 21, 2023 at 08:31:36AM -0700, Warner Losh wrote:
> > On Thu, Dec 21, 2023 at 8:13 AM Mark Johnston <markj@freebsd.org> wrote:
> >
> > > On Thu, Dec 21, 2023 at 04:21:37AM +0000, Warner Losh wrote:
> > > > The branch main has been updated by imp:
> > > >
> > > > URL:
> > >
> https://cgit.FreeBSD.org/src/commit/?id=9e6d11ce9a51d75ed6a94e180f2fb4e9188a2ba4
> > > >
> > > > commit 9e6d11ce9a51d75ed6a94e180f2fb4e9188a2ba4
> > > > Author:     Warner Losh <imp@FreeBSD.org>
> > > > AuthorDate: 2023-12-20 19:09:09 +0000
> > > > Commit:     Warner Losh <imp@FreeBSD.org>
> > > > CommitDate: 2023-12-21 04:16:45 +0000
> > > >
> > > >     vtnet: Adjust rx buffer so IP header 32-bit aligned
> > > >
> > > >     Call madj(m, ETHER_ALIGN) to offset rx buffers when allocating
> them.
> > > >     This improves performance everywhere, and allows armv7 to work at
> > > all.
> > > >
> > > >     PR:                     271288 (PR had a different fix than I
> wound
> > > up with)
> > > >     MFC After:              3 days
> > > >     Sponsored by:           Netflix
> > > >     Differential Revision:  https://reviews.freebsd.org/D43136
> > > > ---
> > > >  sys/dev/virtio/network/if_vtnet.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/sys/dev/virtio/network/if_vtnet.c
> > > b/sys/dev/virtio/network/if_vtnet.c
> > > > index 287af7751066..360176e4f845 100644
> > > > --- a/sys/dev/virtio/network/if_vtnet.c
> > > > +++ b/sys/dev/virtio/network/if_vtnet.c
> > > > @@ -1532,8 +1532,8 @@ vtnet_rx_alloc_buf(struct vtnet_softc *sc, int
> > > nbufs, struct mbuf **m_tailp)
> > > >                       m_freem(m_head);
> > > >                       return (NULL);
> > > >               }
> > > > -
> > > >               m->m_len = size;
> > > > +             m_adj(m, ETHER_ALIGN);
> > >
> > > The driver is expecting to get a cluster of size
> sc->vtnet_rx_clustersz,
> > > but now it's getting one of size sc->vtnet_rx_clustersz - 2.  I don't
> > > see how this change can be sufficient on its own: what prevents virtio
> > > from writing sc->vtnet_rx_clustersz bytes and thereby overwriting the
> > > two bytes following the cluster?
> > >
> >
> > The only trouble that I saw was here:
> >
> >                 /*
> >                  * Every mbuf should have the expected cluster size since
> > that
> >                  * is also used to allocate the replacements.
> >                  */
> >                 KASSERT(m->m_len == clustersz,
> >                     ("%s: mbuf size %d not expected cluster size %d",
> > __func__,
> >                     m->m_len, clustersz));
> >
> > and even that's fine: We can adjust that assert. The next lines I think
> > make it fine:
> >                 m->m_len = MIN(m->m_len, len);
> >
> > so we only use what we say is there to land bytes into the mbuf.  I'm now
> > not sure though what to do about a few lines later:
> >
> >        m_new = vtnet_rx_alloc_buf(sc, nreplace, &m_tail);
> >         if (m_new == NULL) {
> >                 m_prev->m_len = clustersz;
> >                 return (ENOBUFS);
> >         }
> >
> > which likely needs to be adjusted (or the old saved size).
>
> I'm not certain about what it should be doing instead, but yes this code
> looks wrong now.
>

Yea. We need to do this at time of use not time of alloc.

> I likely didn't see asserts in my testing because I didn't have LRO slow
> > path packets.
> >
> > Do you see other places that would need adjusting?
>
> Presumably vtnet_rx_cluster_size() needs to be adjusted too?
>
> I also don't understand why we do this adjustment unconditionally
> instead of only when __NO_STRICT_ALIGNMENT is not defined, as almost all
> other NIC drivers do.
>

None of the ones I looked at had this... maybe I got unlucky. I'd think
we'd have better performance if things are aligned all the way... is there
a reason not to?

Warner

>
> > All the other places
> > seem to use m_len correctly on the rx path, but I'm not a nic driver guy,
> > and I might be missing something.
>