Re: git: eb93b99d6986 - main - in_pcb: delay crfree() down into UMA dtor

From: Kristof Provost <kp_at_FreeBSD.org>
Date: Wed, 15 Dec 2021 09:47:42 UTC

On 15 Dec 2021, at 6:58, Gleb Smirnoff wrote:
> On Tue, Dec 14, 2021 at 10:42:49PM +0100, Kristof Provost wrote:
> K> >     in_pcb: delay crfree() down into UMA dtor
> K> >
> K> >     inpcb lookups, which check inp_cred, work with pcbs that potentially
> K> >     went through in_pcbfree().  So inp_cred should stay valid until SMR
> K> >     guarantees its invisibility to lookups.
> K> >
> K> >     While here, put the whole inpcb destruction sequence of in_pcbfree(),
> K> >     inpcb_dtor() and inpcb_fini() sequentially.
> K> >
> K> >     Submitted by:           markj
> K> >     Differential revision:  https://reviews.freebsd.org/D33273
> K>
> K> For some reason it looks like this commit causes jails to fail to get
> K> fully cleaned up.
> K> I can reproduce that trivially with
> K> `cd /usr/tests/sys/net ; kyua test if_bridge_test:bridge_transmit_ipv4_unicast ; jls -na`.
> K>
> K> Note the jails in dying state.
> K>
> K> The jails created by that test never go away. It’s as if
> K> `crfree(inp->inp_cred);` doesn’t actually get called. And indeed, it
> K> looks like inpcb_dtor() does not get called at all.
>
> Yes, I faced this problem today, too. :(
>
> My radical opinion is that per-VNET pcb zones should just be eliminated.
> The only purpose they serve is imposing the maxsockets limit separately
> for each VNET. But we already have the maxsockets limit on the socket
> zone, which is _global_!
>
> Can anybody explain to me the sense of a per-VNET, per-pcb zone limit
> set to the same maxsockets value? You can't create a pcb without a
> socket, which is guaranteed by the in_pcballoc() prototype. Of course
> I understand that pcbs may outlive the socket. But those pcbs that
> outlive a socket are eventually garbage collected, as their lifetime
> is finite. Anyway, jail/VNET was never declared to be a resource
> management framework!
>

rctl(8) does appear to support per-jail resource limits, but I’m not sure how complete or functional that is.

I don’t really have any strong feelings either way.

> So, for this particular problem I would suggest just eliminating the
> per-VNET pcb zones, but in general the fact that an idle SMR zone may
> never purge its cache sucks and needs improvement.
>
Yeah, that’s certainly going to need some love at some point.
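
To make the lifetime problem concrete, here is a toy user-space sketch in C. It is not the kernel code. It assumes, as the discussion above suggests, that an item freed to the zone just sits in a cache, and that its destructor (where the credential reference would be dropped) only runs when that cache is drained. All toy_* names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; these are NOT the real kernel structures. */
struct toy_jail { int refs; bool dying; };
struct toy_cred { int refs; struct toy_jail *jail; };
struct toy_pcb  { struct toy_cred *cred; };

#define TOY_CACHE_MAX 8

/* A zone whose freed items are parked in a cache until it is drained. */
struct toy_zone {
	struct toy_pcb *cache[TOY_CACHE_MAX];
	size_t ncached;
};

/* Drop one reference on the jail. */
void
toy_jail_rele(struct toy_jail *j)
{
	assert(j->refs > 0);
	j->refs--;
}

/* Drop one credential reference; the last one releases the jail. */
void
toy_crfree(struct toy_cred *c)
{
	assert(c->refs > 0);
	if (--c->refs == 0)
		toy_jail_rele(c->jail);
}

/* Destructor: the only place the pcb gives up its credential. */
void
toy_pcb_dtor(struct toy_pcb *p)
{
	toy_crfree(p->cred);
	p->cred = NULL;
}

/* Freeing only parks the item; the destructor is NOT run here. */
bool
toy_zone_free(struct toy_zone *z, struct toy_pcb *p)
{
	if (z->ncached == TOY_CACHE_MAX)
		return (false);
	z->cache[z->ncached++] = p;
	return (true);
}

/* Draining the cache is what finally runs the destructors. */
void
toy_zone_drain(struct toy_zone *z)
{
	for (size_t i = 0; i < z->ncached; i++)
		toy_pcb_dtor(z->cache[i]);
	z->ncached = 0;
}

/* A dying jail can only go away once nothing references it. */
bool
toy_jail_is_gone(const struct toy_jail *j)
{
	return (j->dying && j->refs == 0);
}
```

In this model, a jail marked dying whose only remaining reference is held through a cached pcb's credential stays around after toy_zone_free(), and only goes away once toy_zone_drain() runs. If nothing ever drains the cache, it stays in the dying state forever.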

Kristof