cvs commit: src/sys/sys msg.h sem.h shm.h

Robert Watson rwatson at freebsd.org
Sat Nov 20 13:16:15 GMT 2004


On Sat, 20 Nov 2004, Alexander Leidinger wrote:

> On Fri, 19 Nov 2004 13:14:50 +0000 (GMT)
> Robert Watson <rwatson at freebsd.org> wrote:
> 
> > - If you have multiple name spaces, it makes it hard for the administrator
> >   running outside the jail to track and manage IPC resources that are
> >   leaked in Jails.  ipcs and ipcrm are written under the assumption of a
> >   single name space, and the whole management infrastructure and APIs
> >   there will become substantially more complicated if multiple name spaces
> >   exist.  Especially given that the resource limits for System V IPC are
> >   both very concrete and global.
> 
> Are you talking about the userland API, or about the in-kernel API? 

Userland API; implementing the kernel side, modulo dealing with the
loading and unloading issue, is relatively straightforward.

> If you are talking about the userland API: wouldn't it be easier if
> we use the following constraints?
>  - The admin of the host has no direct access to the jail's IPC, only an
>    admin in the jail can manage it (the host admin can use jexec to
>    manage IPC).
>  - If a jail gets shut down, all IPC resources of this jail are removed.

Sure.  But that makes it fairly inconvenient to track resource usage over
a large number of jails.  Consider ps(1)/kill(1) as a preferred example of
how name spacing might work: if the administrator wants to track and limit
excessive CPU use by mis-behaving applications, they can run inside or
outside the jail.  Inside the jail, reporting is limited to processes
associated with the jail, as is remedial action (kill).  Outside the jail,
the scope of monitoring is the whole system, and likewise, the scope of
action is the whole system.  Before we walk too far into virtualizing
System V IPC for use in jail, we need to provide that level of flexibility
for its resources: the ability to accurately and conveniently track
allocation of resources both system-wide and per-jail, and manage those
resources in a similar form.  This can be done, but that's not sufficient: 
it also has to actually be done :-).
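
As a rough illustration of the ps(1)/kill(1)-style scoping described
above, the visibility rule could be modelled as below.  This is a sketch
only, not kernel code: the seg_model structure, the convention that jail
ID 0 means "not jailed", and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: jail ID 0 stands for the host (not jailed). */
#define HOST_JID 0

/* Hypothetical, simplified stand-in for a System V shared memory segment. */
struct seg_model {
	int	key;		/* IPC key */
	int	owner_jid;	/* jail that allocated it, HOST_JID if none */
	size_t	bytes;		/* size of the allocation */
};

/*
 * A viewer outside any jail sees every segment; a jailed viewer sees
 * only the segments belonging to its own jail.
 */
static int
seg_visible(int viewer_jid, const struct seg_model *seg)
{
	assert(seg != NULL);
	return (viewer_jid == HOST_JID || viewer_jid == seg->owner_jid);
}

/* Total bytes the viewer is allowed to account for. */
static size_t
seg_bytes_visible(int viewer_jid, const struct seg_model *segs, size_t n)
{
	size_t i, total = 0;

	for (i = 0; i < n; i++)
		if (seg_visible(viewer_jid, &segs[i]))
			total += segs[i].bytes;
	return (total);
}
```

The same predicate would gate remedial action (the ipcrm analogue), so
that the scope of reporting and the scope of action always match.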

One problem I have with pushing processes into jails to manage the
resources of jails is that it makes the process a member of the jail, and
therefore more vulnerable to attacks from the jail.  It's also a
heavyweight mechanism, requiring the discarding of the process before you
can move on to the next jail, since attaching is one-way (for good reason).

A preferred model might choose to modify the export of System V IPC
management information using sysctl so that it reflects the notion of
multiple name spaces.  I.e., right now sysctl_shmsegs has no real scoping
notion, simply dumping all the segment descriptors to userspace.  Assuming
we had a strong notion of named or numbered name spaces, that monitoring
interface would need to take it into account, dumping only appropriate
segment information.  We also get to make a design choice: do we tie the
virtualization of the System V IPC name spaces to the jail(4) mechanism,
or do we genericize it and have jail simply take advantage of that?  For
example, chroot(2) exists independently of jail, but is used by jail as part
of constructing a name space.  When I last experimented with name spaces
and Jail, I added a new System V IPC name space pointer to each process,
had a name space creation call, and prevented jails from changing name
spaces.  This is flexible and reusable, but also makes management a bit
harder from a pure Jail perspective, since there's a level of indirection
-- you use the jail reference to the name space to identify which name
space to operate on.
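
The indirection described above might look roughly like the following.
This is a sketch under stated assumptions, not the experimental code
itself: every structure and function name here (ipc_ns_model,
proc_set_ns, dump_segs_for_jail, and so on) is hypothetical, and the
dump routine only imitates what a name-space-aware sysctl export would
have to do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical System V IPC name space object. */
struct ipc_ns_model {
	int	id;
};

/* Hypothetical jail: holds a reference to the name space it uses. */
struct jail_model {
	int			 jid;
	struct ipc_ns_model	*ns;
};

/* Hypothetical process: carries its own name space pointer. */
struct proc_model {
	int			 pid;
	struct jail_model	*jail;	/* NULL if not jailed */
	struct ipc_ns_model	*ns;
};

/* Hypothetical segment descriptor, tagged with its name space. */
struct shmseg_model {
	int				 key;
	const struct ipc_ns_model	*ns;
};

/* Switch name space; refused (-1) for jailed processes. */
static int
proc_set_ns(struct proc_model *p, struct ipc_ns_model *ns)
{
	assert(p != NULL && ns != NULL);
	if (p->jail != NULL)
		return (-1);
	p->ns = ns;
	return (0);
}

/*
 * Management path: follow the jail's name space reference, then copy
 * out only the keys of segments in that name space.  Returns the number
 * of keys written, at most max.
 */
static size_t
dump_segs_for_jail(const struct jail_model *j, const struct shmseg_model *segs,
    size_t n, int *keys_out, size_t max)
{
	size_t i, count = 0;

	assert(j != NULL && keys_out != NULL);
	for (i = 0; i < n && count < max; i++)
		if (segs[i].ns == j->ns)
			keys_out[count++] = segs[i].key;
	return (count);
}
```

Note the extra hop in dump_segs_for_jail(): the caller names a jail, but
the filter is applied on the name space the jail refers to, which is the
level of indirection that makes jail-centric management slightly harder.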

Anyhow, I'd encourage people to experiment, but with the understanding
that the name space issues are not entirely trivial, especially if you
want to maintain the ease of use and configurability of Jail by avoiding
introducing too much complexity or limiting/breaking current tools.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert at fledge.watson.org      Principal Research Scientist, McAfee Research



