NFS + nullfs + jail = zombies?
Thomas Johnson
tommyj27 at gmail.com
Sun Jul 10 15:28:45 UTC 2016
If NFS and jails are known to misbehave, I can look at alternatives.
That said, if there is interest in troubleshooting this further, I am
set up to play guinea pig.
To test this scenario without jails, I set jail_enable="NO" and moved
the nullfs lines from my "jail" fstab into /etc/fstab. After a
half-dozen cycles of "reboot, generate activity on the mount, umount",
I have been unable to reproduce the behavior. FWIW, I did have to mark
the nullfs mounts "late" in this configuration so that NFS was mounted
before the nullfs mounts were created.
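Concretely, the non-jail test fstab looked roughly like this (a sketch
reconstructed from the configs quoted below, not my exact file; the
"late" keyword defers those mounts to the mountlate pass, after the NFS
mount at /jail/base is in place):

```
# /etc/fstab fragment for the non-jail test (sketch; paths taken from
# the jail fstab quoted below)
10.219.212.1:/vol/dev/wds/jail_base  /jail/base             nfs     ro       0  0
/jail/base                           /var/jail/wds1-1a      nullfs  ro,late  0  0
/var/jail-vars/wds1-1a               /var/jail/wds1-1a/var  nullfs  rw,late  0  0
```

Without "late", the nullfs mounts were attempted before the NFS source
existed and failed at boot.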
On Sat, Jul 9, 2016 at 9:22 AM, James Gritton <jamie at gritton.org> wrote:
> On 2016-07-08 12:28, Thomas Johnson wrote:
>>
>> I am working on developing a clustered application utilizing jails and
>> running into problems that seem to be NFS-related. I'm hoping that
>> someone can point out my error.
>>
>> The jail images and my application data are served via NFS. The host
>> mounts NFS at boot, and then uses nullfs mounts to assemble the jail
>> tree when the jail is created (fstab files and jail.conf are below).
>> This seems to work fine, the jail starts and is usable. The problem
>> comes when I remove/restart the jail. Frequently (but not
>> consistently), the jail gets stuck in a dying state, causing the
>> unmount of the jail root (nullfs) to fail with a "device busy" error.
>>
>> # jail -f /var/local/jail.conf -r wds1-1a
>> Stopping cron.
>> Waiting for PIDS: 1361.
>> .
>> Terminated
>> wds1-1a: removed
>> umount: unmount of /var/jail/wds1-1a failed: Device busy
>> # jls -av
>>    JID  Hostname           Path
>>         Name               State
>>         CPUSetID
>>         IP Address(es)
>>      1  wds1-1a            /var/jail/wds1-1a
>>         wds1-1a            DYING
>>         2
>>         2620:1:1:1:1a::1
>>
>> Through trial-and-error I have determined that forcing an unmount of
>> the root works, but subsequent mounts to that mount point will fail to
>> unmount with the same error. Deleting and recreating the mountpoint
>> fixes the mounting issue, but the dying jail remains permanently.
>>
>> I have also found that if I copy the jail root to local storage and
>> update the jail's fstab to nullfs mount this, the problem seems to go
>> away. This leads me to believe that the issue is related to the NFS
>> source for the nullfs mount. statd and lockd are both running on the
>> host.
>>
>> My relevant configurations are below. I can provide any other
>> information desired.
>>
>> # Host fstab line for jail root.
>> #
>> 10.219.212.1:/vol/dev/wds/jail_base /jail/base nfs ro 0 0
>>
>>
>> # Jail fstab file (mount.fstab)
>> #
>> /jail/base /var/jail/wds1-1a nullfs ro 0 0
>> # writable (UFS-backed) /var
>> /var/jail-vars/wds1-1a /var/jail/wds1-1a/var nullfs rw 0 0
>>
>>
>> # jail.conf file
>> #
>> * {
>> devfs_ruleset = "4";
>> mount.devfs;
>> exec.start = "/bin/sh /etc/rc";
>> exec.stop = "/bin/sh /etc/rc.shutdown";
>> interface = "vmx1";
>> allow.dying = 1;
>> exec.prestart = "/usr/local/bin/rsync -avC --delete
>> /jail/${image}/var/ /var/jail-vars/${host.hostname}/";
>> }
>>
>> # JMANAGE wds1-1a
>> wds1-1a {
>> path = "/var/jail/wds1-1a";
>> ip6.addr = "2620:1:1:1:1a::1";
>> host.hostname = "wds1-1a";
>> host.domainname = "dev";
>> mount.fstab = "/var/local/fstab.wds1-1a";
>> $image = "base";
>> }
>
>
> What happens if you take jails out of the equation? I know this isn't
> entirely a non-jail issue, but I wonder if a jail is required for the mount
> point to be un-re-mountable. I've heard before of NFS-related problems
> where a jail remains dying forever, but this has been more of an annoyance
> than a real problem.
>
> It's not so much that I want to absolve jails, as I want to see where the
> main fight exists. It's tricky enough fixing an interface between two
> systems, but we've got three here.
>
> - Jamie