Re: git: eb93b99d6986 - main - in_pcb: delay crfree() down into UMA dtor

From: Gleb Smirnoff <glebius_at_freebsd.org>
Date: Wed, 29 Dec 2021 01:57:21 UTC

On Wed, Dec 29, 2021 at 12:29:47AM +0100, Marko Zec wrote:
M> > On Fri, Dec 17, 2021 at 10:17:55PM -0800, Gleb Smirnoff wrote:
M> > T> T> K> The jails created by that test never go away. It’s as if 
M> > T> T> K> `crfree(inp->inp_cred);` doesn’t actually get called. And
M> > T> T> K> indeed, it looks like inpcb_dtor() does not get called at
M> > T> T> K> all.  
M> > T> T> 
M> > T> T> Yes, I faced this problem today, too. :(
M> > T> T> 
M> > T> T> My radical opinion is that per-VNET pcb zones should just be
M> > T> T> eliminated. The only purpose they serve is imposing the
M> > T> T> maxsockets limit separately for each VNET. But we already have
M> > T> T> the maxsockets limit on the socket zone, which is _global_!
M> > T> T> 
M> > T> T> Can anybody explain to me the sense of a per-VNET, per-pcb zone
M> > T> T> limit set to the same maxsockets value? You can't create a pcb
M> > T> T> without a socket, which is guaranteed by the in_pcballoc()
M> > T> T> prototype. Of course I understand that pcbs may outlive the
M> > T> T> socket. But those pcbs that outlive a socket are eventually
M> > T> T> garbage collected, as their lifetime is finite. Anyway, jail/VNET
M> > T> T> was never declared as a resource management framework!
M> > T> T> 
M> > T> T> So, for this particular problem I would suggest just eliminating
M> > T> T> per-VNET pcb zones, but in general the fact that an idle SMR zone
M> > T> T> may never purge its cache sucks and needs improvement.
M> > T> 
M> > T> I have created a patch that would mitigate that problem. Once the
M> > T> zones are global, the jails will eventually die if there is some
M> > T> pcb zone traffic.
M> > T> 
M> > T> https://reviews.freebsd.org/D33542  
M> > 
M> > Although I still believe that PCB zones belong to global state rather
M> > than to a VNET, the patch doesn't help to mitigate massive memory
M> > leaks with vnet jails on a machine that is dedicated solely to running
M> > a test suite. If the machine does nothing except run a test suite,
M> > there is almost zero pcb traffic. If there is no pcb zone traffic, the
M> > SMR caches stay, and thus destroyed jails will also stay. Our vnet jail
M> > "weighs" a lot! Even with the global PCB zone patch applied, each
M> > vnet jail creates 33 UMA zones!
M> > 
M> > I think we need a KPI to purge the SMR caches, and we also need to put
M> > vnet jails on a diet. These are two independent problems, of course.
M> 
M> +1 for nuking all per-vnet PCB zones and the like!  At the time I
M> V_irtualized them during the early stages of VNET implementation, the
M> focus was on correctness and tracking of inter-vnet resource leaks. Once
M> that step was reasonably completed (circa 15 years ago!), per-VNET
M> zones became a pure waste of memory, amplified with per-CPU local free
M> pools for each zone, not to mention the PITA with VNET cleanups...
M> 
M> If memory still serves me well, a few folks asserted that per-VNET zones
M> could be useful for hypothetical VNET snapshots / live migration to
M> another machine, a project I heard about on several occasions but have
M> never seen...
M> 
M> So, by all means go ahead and devirtualize them all...

Let's begin with PCB zones; the patches are ready for review:

https://reviews.freebsd.org/D33542  (click on Stack to see the
dependency reviews)
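
For anyone following the thread, here is a minimal userland sketch of the
failure mode discussed above. It is not kernel code and does not use the UMA
or SMR APIs; every toy_* name is hypothetical. It only models the idea that
once the credential reference is dropped in the zone destructor, a freed pcb
parked in an idle zone cache keeps the jail's credential alive until
something drains that cache.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a jail's credential: a plain reference count. */
struct toy_cred {
	int refs;	/* outstanding references */
	int freed;	/* set once the last reference is dropped */
};

static void
toy_crhold(struct toy_cred *c)
{
	c->refs++;
}

static void
toy_crfree(struct toy_cred *c)
{
	assert(c->refs > 0);
	if (--c->refs == 0)
		c->freed = 1;	/* the "jail" can finally go away */
}

/* Toy stand-in for an inpcb that holds a credential reference. */
struct toy_pcb {
	struct toy_cred *cred;
};

static void
toy_pcb_init(struct toy_pcb *p, struct toy_cred *c)
{
	toy_crhold(c);
	p->cred = c;
}

/* Destructor: the only place where the pcb's cred reference is dropped. */
static void
toy_pcb_dtor(struct toy_pcb *p)
{
	toy_crfree(p->cred);
	p->cred = NULL;
}

#define	TOY_CACHE_MAX	8

/* Toy zone: freed items are parked in a cache, destructor not yet run. */
struct toy_zone {
	struct toy_pcb *cache[TOY_CACHE_MAX];
	size_t ncached;
};

/* Free an item: park it if there is room, otherwise destroy it right away. */
static void
toy_zone_free(struct toy_zone *z, struct toy_pcb *p)
{
	if (z->ncached < TOY_CACHE_MAX)
		z->cache[z->ncached++] = p;
	else
		toy_pcb_dtor(p);
}

/* Explicit purge: run the destructor on everything parked in the cache. */
static size_t
toy_zone_drain(struct toy_zone *z)
{
	size_t n = z->ncached;

	while (z->ncached > 0)
		toy_pcb_dtor(z->cache[--z->ncached]);
	return (n);
}
```

In this model, freeing the last pcb of a destroyed jail into an otherwise
idle zone leaves the credential referenced until toy_zone_drain() is called,
which is the role an explicit purge KPI would play.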

-- 
Gleb Smirnoff