Re: init / supervisor in jail

From: James Gritton <jamie_at_freebsd.org>
Date: Mon, 10 Nov 2025 19:16:01 UTC

On 2025-11-10 04:27, Andriy Gapon wrote:
> I played a little bit with OCI containers and podman.
> I had a hiccup with one specific container created for Docker / Linux.
> Its difference from other containers is that it uses multiple daemons 
> and a supervisor process to take care of them.  That particular 
> supervisor is another variation of "advanced init", it's called s6.  
> Apparently, it is relatively popular for container use (not sure about 
> host systems).  Probably other alternatives can be / are used for that 
> purpose as well.
> 
> I think that this is what a supervisor in a container needs:
> 1. its PID is 1;
> 2. orphaned processes get re-parented to it.
> 
> I think that (1) is not a hard requirement, but it's an easy way to 
> check if the process would be able to work as init.
> Also, some other processes might expect to find init at PID 1, but I am 
> not sure about that.
> 
> (2) is important for doing the supervising (at least, when 
> procctl(PROC_REAP*) is not used).
> 
> I think that on Linux they have separate PID namespace per container, 
> so the first process to run naturally gets PID 1.
> 
> I think that a per-container PID namespace may be overkill.
> Maybe there is a way to make PID 1 special without going that way.
> 
> E.g., a jail could record the first process it runs.
> We can patch up getpid() to return 1 for that process.
> Also, we could patch up the process lookup to return the first process 
> in the jail for PID 1.
> 
> Re-parenting to the "jail init" sounds harder but should be possible as 
> well (e.g., using PROC_REAP).
> 
> Not sure what to do if the "jail init" dies... should all processes in 
> the jail get killed, and should the jail die as well (unless 
> persistent)?
> 
> This proposal sounds like a kludge but it could be a shortcut to 
> support more Linux containers and to allow similar FreeBSD jails / 
> containers with alternative init-s / supervisors.
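
For reference, the PROC_REAP route mentioned for requirement (2) could look
roughly like the sketch below.  It is only an illustration of a userland
supervisor taking over orphans, not the in-kernel jail change being
discussed.  The names become_reaper() and reap_all() are made up for this
example, and the Linux prctl() branch is there only so the sketch builds
off FreeBSD.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>
#if defined(__FreeBSD__)
#include <sys/procctl.h>
#elif defined(__linux__)
#include <sys/prctl.h>
#endif

/*
 * Hypothetical helper: ask the kernel to re-parent orphaned descendants
 * to the calling process instead of to PID 1.
 * Returns 0 on success, -1 on error.
 */
int
become_reaper(void)
{
#if defined(__FreeBSD__)
	return (procctl(P_PID, getpid(), PROC_REAP_ACQUIRE, NULL));
#elif defined(__linux__)
	/* Assumption: Linux equivalent, used only for portability here. */
	return (prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0));
#else
	errno = ENOSYS;
	return (-1);
#endif
}

/*
 * Hypothetical helper: block until every child, including re-parented
 * orphans, has exited.  Returns the number of processes reaped.
 */
int
reap_all(void)
{
	int reaped = 0;

	for (;;) {
		pid_t pid = waitpid(-1, NULL, 0);

		if (pid > 0) {
			reaped++;
			continue;
		}
		if (pid == -1 && errno == EINTR)
			continue;	/* interrupted by a signal, retry */
		break;			/* ECHILD: no children left */
	}
	return (reaped);
}
```

A real supervisor would more likely call waitpid() with WNOHANG from its
SIGCHLD handling and restart the services it cares about; the blocking
loop just keeps the sketch short.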

Far from being a kludge, I think it's a feature we need, and one at the 
top of my list.  Forcing it to look like PID 1 from the jailed 
perspective is definitely doable (and something I'd done outside of the 
project a decade ago).  In addition to those two requirements, I would 
add one that answers your last question:

3. signals to init and reboot(2) work as they would on the host side.

A jailed reboot would kill all processes and restart rc, and possibly do 
other kernel-side cleanups yet to be clearly defined.  A jailed halt 
would remove the jail.  A jailed single-user mode could exist where 
instead of init spawning a shell, it just sits around while the system 
has a chance to jexec into it.

init handles various signals by rebooting/halting/etc, and it should be 
able to do that as it does now, by calling reboot(2), directing the 
kernel to do what it needs to with the jail.  If init goes away, it's 
probably like a halt and removes the jail.
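
To make the signal handling concrete, here is a rough sketch of how the
signals init(8) acts on today might map onto the jailed behaviours
described above.  The enum and function names are invented for this
example, nothing like them exists in the tree, and treating SIGUSR2
(halt and power off on a host) as a plain jail halt is an assumption.

```c
#include <signal.h>

/* Hypothetical jail-level actions, following the description above. */
enum jail_action {
	JAIL_IGNORE,		/* signal not relevant to jail lifecycle */
	JAIL_REBOOT,		/* kill all processes, restart rc */
	JAIL_HALT,		/* kill all processes, remove the jail */
	JAIL_SINGLE_USER	/* idle and wait for jexec */
};

/*
 * Hypothetical mapping from the signals init(8) handles on a host
 * to what a jailed init might ask the kernel to do.
 */
enum jail_action
jail_action_for_signal(int sig)
{
	switch (sig) {
	case SIGINT:		/* host: reboot */
		return (JAIL_REBOOT);
	case SIGUSR1:		/* host: halt */
	case SIGUSR2:		/* host: halt and power off (assumed same) */
		return (JAIL_HALT);
	case SIGTERM:		/* host: drop to single-user mode */
		return (JAIL_SINGLE_USER);
	default:
		return (JAIL_IGNORE);
	}
}
```

In the real thing init would keep calling reboot(2) as it does now, and
the kernel would decide what that means for the jail.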

This is definitely something that will be happening.

- Jamie