Phantom Jails

Dirk Engling erdgeist at erdgeist.org
Fri Nov 17 03:20:31 UTC 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Rumors went around and tales were told about jails that magically keep
haunting the prison list, even after they are deceased.

Most people consider this a rather aesthetic issue. However, if you run
your jails from directories that need to be unmounted (e.g. from
md images, on external drives, from gbde or geli images), those phantom
jails become rather annoying, since you cannot umount their roots.

Investigation has shown that

1) Sockets hold a reference to (increase the reference counter in) the
ucred structure of the calling process.
2) This ucred structure in turn holds a reference to (increases the
reference counter in) the prison struct representing the jail this
process belongs to.
3) The prison struct in turn keeps a handle to the jail's root directory.

If a process holding a TCP connection is killed, the connection is
inherited by the kernel, where it waits for TCP teardown or a TCP
timeout to occur. Only then is the socket's reference to the ucred
released, which in turn releases the ucred's reference to the prison
struct (thus terminating the phantom jail). If that was the last ucred
referencing the prison, the prison and its handle to the root directory
are released as well (solving my un-umount-able images).
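
To make the chain above concrete, here is a small userland toy model in
C. It is only a sketch under my own assumptions: the struct layouts and
the helper names (toy_socket, cred_hold, cred_release, prison_release,
jail_listed, phantom_scenario) are made up for illustration and are not
the kernel's actual definitions. It only shows how a lingering socket
keeps the prison, and with it the root directory, alive.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: names follow the post, not the real kernel layout. */
struct prison {
    int refs;        /* number of ucreds referencing this jail */
    int root_held;   /* 1 while the jail's root directory is pinned */
};

struct ucred {
    int refs;        /* number of holders (processes, sockets) */
    struct prison *pr;
};

struct toy_socket {
    struct ucred *cred;
};

static void prison_release(struct prison *pr)
{
    if (--pr->refs == 0)
        pr->root_held = 0;   /* last reference gone: root can be unmounted */
}

static void cred_hold(struct ucred *cr)
{
    cr->refs++;
}

static void cred_release(struct ucred *cr)
{
    if (--cr->refs == 0)
        prison_release(cr->pr);
}

/* A jail shows up in the list for as long as anything references it. */
static int jail_listed(const struct prison *pr)
{
    return pr->refs > 0;
}

/* Walks through the phantom jail life cycle; returns 1 if it behaved
 * as described in the text. */
static int phantom_scenario(void)
{
    struct prison jail = { 1, 1 };      /* referenced by the cred below */
    struct ucred cred = { 1, &jail };   /* held by the server process */
    struct toy_socket so = { &cred };
    int phantom;

    cred_hold(so.cred);     /* opening the socket takes a cred reference */
    cred_release(&cred);    /* process is killed, its reference goes away */

    /* No process left, but the jail is still listed and its root pinned. */
    phantom = jail_listed(&jail) && jail.root_held;
    assert(phantom);

    cred_release(so.cred);  /* TCP teardown or timeout finally happens */
    so.cred = NULL;

    return phantom && !jail_listed(&jail) && !jail.root_held;
}
```

Calling phantom_scenario() from a main() walks through both phases: the
jail survives the death of its last process, and only disappears once
the socket lets go of its ucred.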

Some kinds of phantom jails have been sighted that did not vanish after
the TCP timeout; those might be pinned by open files or mmapped
regions. However, the above case happens regularly with my mail server
jail, which holds hundreds of IMAP connections: a single disconnected
DSL user can prevent TCP teardown from completing, thus forcing me to
force-umount the mail server.

My suggestion would be (I will provide a patch if discussion produces
no major disagreement) to release ucred structs held by sockets as soon
as the process dies. They are used for accounting purposes only,
anyway. The same may apply to the other types of phantom jails as well.
I could not create those deliberately and therefore cannot spot the
exact location to fix.
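
As a rough illustration of the idea (this is not the patch, just a
userland toy under my own assumptions), the following C sketch drops
the ucred reference of every socket a process owned at the moment the
process exits. All names here (toy_socket, cred_release,
proc_exit_drop_socket_creds, proposal_scenario) are hypothetical, and
the model ignores sockets shared between several processes.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: not the kernel's real structures. */
struct prison { int refs; };
struct ucred { int refs; struct prison *pr; };
struct toy_socket { struct ucred *cred; };

static void cred_release(struct ucred *cr)
{
    if (--cr->refs == 0)
        cr->pr->refs--;   /* last holder gone: drop the prison reference */
}

/* Hypothetical exit hook: release the cred of every socket the dying
 * process owned, instead of waiting for TCP teardown. */
static void proc_exit_drop_socket_creds(struct toy_socket *socks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (socks[i].cred != NULL) {
            cred_release(socks[i].cred);
            socks[i].cred = NULL;   /* socket lingers on without a cred */
        }
    }
}

/* Returns the prison's reference count after the process has died
 * while its two connections are still waiting for teardown. */
static int proposal_scenario(void)
{
    struct prison jail = { 1 };
    struct ucred cred = { 3, &jail };   /* process plus two sockets */
    struct toy_socket socks[2] = { { &cred }, { &cred } };

    proc_exit_drop_socket_creds(socks, 2);
    assert(cred.refs == 1);   /* only the process itself is left */

    cred_release(&cred);      /* the process's own reference */
    return jail.refs;
}
```

In this model the prison reference count drops to zero as soon as the
process is gone, although both sockets still exist and have not seen
any teardown yet.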

Comments?

  erdgeist

P.S.: If you want to reproduce a phantom jail, try the following:
1) Create and start a jail.
2) Start an ssh/web/whatever server within the jail.
3) Connect to that server from the host system.
4) Keep this connection open while you kill the jail.
5) Do a 'jls' and compare its output to "ps axuu | grep J".
6) Kill the process that connected to the service.
7) Do a 'jls' again.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFFXSp5ImmQdUyYEgkRAtOAAJ4iSzyu2LOf+RBNArvYAk1Tv8cssACfRxJa
12OGEwWugcIDhlGGTHJrz0o=
=gXK8
-----END PGP SIGNATURE-----

