Re: git: eb93b99d6986 - main - in_pcb: delay crfree() down into UMA dtor

From: Kyle Evans <kevans_at_freebsd.org>
Date: Wed, 15 Dec 2021 05:50:12 UTC
On Tue, Dec 14, 2021 at 8:35 PM Mark Johnston <markj@freebsd.org> wrote:
>
> On Tue, Dec 14, 2021 at 10:42:49PM +0100, Kristof Provost wrote:
> > On 5 Dec 2021, at 19:47, Gleb Smirnoff wrote:
> > > The branch main has been updated by glebius:
> > >
> > > URL:
> > > https://cgit.FreeBSD.org/src/commit/?id=eb93b99d698674e3b1cc7139fda98e2b175b8c5b
> > >
> > > commit eb93b99d698674e3b1cc7139fda98e2b175b8c5b
> > > Author:     Gleb Smirnoff <glebius@FreeBSD.org>
> > > AuthorDate: 2021-12-05 16:47:24 +0000
> > > Commit:     Gleb Smirnoff <glebius@FreeBSD.org>
> > > CommitDate: 2021-12-05 18:46:37 +0000
> > >
> > >     in_pcb: delay crfree() down into UMA dtor
> > >
> > >     inpcb lookups, which check inp_cred, work with pcbs that
> > > potentially went
> > >     through in_pcbfree().  So inp_cred should stay valid until SMR
> > > guarantees
> > >     its invisibility to lookups.
> > >
> > >     While here, put the whole inpcb destruction sequence of
> > > in_pcbfree(),
> > >     inpcb_dtor() and inpcb_fini() sequentially.
> > >
> > >     Submitted by:           markj
> > >     Differential revision:  https://reviews.freebsd.org/D33273
> > >
> >
> > For some reason it looks like this commit causes jails to fail to get
> > fully cleaned up.
> > I can reproduce that trivially with `cd /usr/tests/sys/net ; kyua test
> > if_bridge_test:bridge_transmit_ipv4_unicast ; jls -na`.
> >
> > Note the jails in dying state.
> >
> > The jails created by that test never go away. It’s as if
> > `crfree(inp->inp_cred);` doesn’t actually get called. And indeed, it
> > looks like inpcb_dtor() does not get called at all.
>
> For SMR zones it won't get called until there's some attempt to reuse a
> bucket.  There's a fairly tight bound on this in the sense that there
> can't be a large number of freed items waiting for the dtor to be
> invoked, so the amount of consumed memory should be small, but there's
> no bound on how long one might have to wait.  I didn't realize that jail
> destruction could block waiting for a ucred reference to vanish.
>

Right, we ran into this problem with wireguard-freebsd, as well;
oftentimes, a ucred will be the last prison ref, so we had to hook
jail's osd remove method and more aggressively purge wg interfaces
that'd otherwise be keeping it alive.

> When I tried this patch with poudriere on a large system there was no
> problem, but the regression test suite is a different story.  I'm not
> sure how best to fix this yet.
>