NFS + nullfs + jail = zombies?
James Gritton
jamie at gritton.org
Sat Jul 9 14:23:08 UTC 2016
On 2016-07-08 12:28, Thomas Johnson wrote:
> I am working on developing a clustered application utilizing jails and
> running into problems that seem to be NFS-related. I'm hoping that
> someone can point out my error.
>
> The jail images and my application data are served via NFS. The host
> mounts NFS at boot, and then uses nullfs mounts to assemble the jail
> tree when the jail is created (fstab files and jail.conf are below).
> This seems to work fine, the jail starts and is usable. The problem
> comes when I remove/restart the jail. Frequently (but not
> consistently), the jail gets stuck in a dying state, causing the
> unmount of the jail root (nullfs) to fail with a "device busy" error.
>
> # jail -f /var/local/jail.conf -r wds1-1a
> Stopping cron.
> Waiting for PIDS: 1361.
> .
> Terminated
> wds1-1a: removed
> umount: unmount of /var/jail/wds1-1a failed: Device busy
> # jls -av
>    JID  Hostname   Path
>         Name       State
>         CPUSetID
>         IP Address(es)
>      1  wds1-1a    /var/jail/wds1-1a
>         wds1-1a    DYING
>         2
>         2620:1:1:1:1a::1
>
> Through trial-and-error I have determined that forcing an unmount of
> the root works, but subsequent mounts to that mount point will fail to
> unmount with the same error. Deleting and recreating the mountpoint
> fixes the mounting issue, but the dying jail remains permanently.
>
> I have also found that if I copy the jail root to local storage and
> update the jail's fstab to nullfs mount this, the problem seems to go
> away. This leads me to believe that the issue is related to the NFS
> source for the nullfs mount. statd and lockd are both running on the
> host.
>
> My relevant configurations are below. I can provide any other
> information desired.
>
> # Host fstab line for jail root.
> #
> 10.219.212.1:/vol/dev/wds/jail_base /jail/base nfs ro 0 0
>
>
> # Jail fstab file (mount.fstab)
> #
> /jail/base /var/jail/wds1-1a nullfs ro 0 0
> # writable (UFS-backed) /var
> /var/jail-vars/wds1-1a /var/jail/wds1-1a/var nullfs rw 0 0
>
>
> # jail.conf file
> #
> * {
> devfs_ruleset = "4";
> mount.devfs;
> exec.start = "/bin/sh /etc/rc";
> exec.stop = "/bin/sh /etc/rc.shutdown";
> interface = "vmx1";
> allow.dying = 1;
> exec.prestart = "/usr/local/bin/rsync -avC --delete /jail/${image}/var/ /var/jail-vars/${host.hostname}/";
> }
>
> # JMANAGE wds1-1a
> wds1-1a {
> path = "/var/jail/wds1-1a";
> ip6.addr = "2620:1:1:1:1a::1";
> host.hostname = "wds1-1a";
> host.domainname = "dev";
> mount.fstab = "/var/local/fstab.wds1-1a";
> $image = "base";
> }
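The trial-and-error recovery described above (force the unmount, then delete and recreate the mount point) can be sketched as a script. This is a dry-run sketch only: the `run` wrapper echoes each command instead of executing it, and the ordering of the nested unmounts is an assumption, not something stated in the report. Drop the wrapper and run as root to apply it for real.

```shell
#!/bin/sh
# Dry-run sketch of the recovery sequence described in this thread.
# Paths match the configuration above.
run() { echo "$@"; }    # remove the echo to actually execute

JROOT=/var/jail/wds1-1a

# Force the unmount that "Device busy" blocked. The nested rw /var
# nullfs mount presumably has to go first, or the root stays busy.
run umount -f "$JROOT/var"
run umount -f "$JROOT"

# Per the report, a mount point that needed a forced unmount stays
# unusable for later mounts; recreating the directory works around it.
run rmdir "$JROOT"
run mkdir -p "$JROOT"
```

Note that even after this, the report says the dying jail itself remains permanently; the script only recovers the mount point.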
What happens if you take jails out of the equation? I know jails are
involved here to some degree, but I wonder whether a jail is actually
required for the mount point to become impossible to remount. I've heard
before of NFS-related problems where a jail remains dying forever, but
that has been more of an annoyance than a real problem.
It's not so much that I want to absolve jails, as I want to see where
the main fight exists. It's tricky enough fixing an interface between
two systems, but we've got three here.
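Taking jails out of the equation could be sketched as exercising the same NFS-backed nullfs mount/unmount cycle directly on the host. This is a dry-run sketch under assumptions: `/jail/base` is the NFS mount from the host fstab above, `/tmp/nullfs-test` is a hypothetical scratch mount point, and the `run` wrapper echoes commands rather than executing them (drop it and run as root to test for real).

```shell
#!/bin/sh
# Dry-run sketch: exercise the NFS -> nullfs -> unmount path with no
# jail involved, to see whether "Device busy" appears anyway.
run() { echo "$@"; }    # remove the echo to actually execute

SRC=/jail/base          # NFS-backed lower layer (from the host fstab)
DST=/tmp/nullfs-test    # hypothetical scratch mount point

run mkdir -p "$DST"
run mount -t nullfs -o ro "$SRC" "$DST"
# Generate some read activity on the mount, then try a plain unmount:
run find "$DST" -type f -name '*.conf'
run umount "$DST"
```

If the plain `umount` fails with "Device busy" here too, the problem lives in the NFS/nullfs interaction and jails are off the hook.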
- Jamie
More information about the freebsd-jail mailing list