ZFS panic in zone_dataset_visible
Scott Burns
scott at bqinternet.com
Mon Sep 22 17:21:18 UTC 2008
Scott Burns wrote:
> Hello,
>
> I am running several servers using Pawel's July 27 ZFS patchset, applied
> against 8-current source from the same day. I have seen a similar panic
> on two different servers:
...
> Stopped at _mtx_lock_flags+0x15: lock cmpxchgq %rsi,0x18(%rdi)
> db> bt
> Tracing pid 95276 tid 100432 td 0xffffff010b3cc000
> _mtx_lock_flags() at _mtx_lock_flags+0x15
> zone_dataset_visible() at zone_dataset_visible+0x94
> zfs_mount() at zfs_mount+0x3e5
...
With a bit of testing, I found that this panic is easily reproducible.
Simply try to list the contents of a snapshot from within a jail, as
long as the snapshot isn't already mounted, and the system panics. If I
mount the snapshot from outside of the jail first, and then list it
inside the jail, it does not panic.
I spent a bit of time debugging this weekend. Trying to list an
unmounted snapshot triggers a zfs_mount() for the snapshot, which calls
zone_dataset_visible() to determine if the snapshot should be visible in
the current zone. When called outside of a jail, it returns true
early on because INGLOBALZONE(curproc) is true; inside a jail it takes
another code path.
The panic is happening after that check, at mtx_lock(&pr->cr_mtx),
because (pr = curthread->td_ucred->cr_prison) is NULL. Interestingly,
it's not NULL if zone_dataset_visible() is triggered by a "zfs list"
command, but it is NULL if zone_dataset_visible() is called from
zfs_mount().
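The order of checks described above can be modelled in userland. This is only a sketch: the mock_* types and the mock_dataset_visible() function are hypothetical stand-ins, not the real kernel definitions, and the jail's dataset-list walk is left out. It shows only why a NULL cr_prison is harmless in the global zone and fatal inside a jail.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's prison and ucred structures. */
struct mock_prison {
	int pr_id;
};

struct mock_ucred {
	struct mock_prison *cr_prison;
};

enum mock_result {
	MOCK_VISIBLE,		/* early return: caller is in the global zone */
	MOCK_CHECK_LIST,	/* would lock the prison and check its datasets */
	MOCK_WOULD_PANIC	/* would lock through a NULL prison pointer */
};

/*
 * Mirrors the control flow described in the message: the global-zone
 * test comes first, and only afterwards is the prison pointer used.
 */
enum mock_result
mock_dataset_visible(const struct mock_ucred *cred, int in_global_zone)
{
	if (in_global_zone)
		return (MOCK_VISIBLE);
	if (cred->cr_prison == NULL)
		return (MOCK_WOULD_PANIC);
	return (MOCK_CHECK_LIST);
}
```

With a NULL cr_prison, the model returns MOCK_VISIBLE when in_global_zone is set and MOCK_WOULD_PANIC otherwise, which matches the behaviour seen from zfs_mount() inside a jail.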
As a temporary workaround, I modified my copy of
cddl/compat/opensolaris/kern/opensolaris_zone.c to have
zone_dataset_visible() return true if it is being called for a snapshot.
I modified it as below:
-if (INGLOBALZONE(curproc))
+if (INGLOBALZONE(curproc) || strchr(dataset, '@'))
This is obviously not ideal, since it allows the manipulation of the
snapshot from another jail if the caller knows that it exists. Since I
am the only one with root access to any of the jails, I am not concerned
with that. "zfs list" continues to behave normally.
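To make the side effect of the modified condition concrete, here is a small userland sketch of just that test. The function name early_visible() and the in_global_zone flag are made up for illustration; in the kernel the flag corresponds to INGLOBALZONE(curproc).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the patched early return: visible if the caller
 * is in the global zone, or if the dataset name contains '@' (that is,
 * it names a snapshot).
 */
int
early_visible(int in_global_zone, const char *dataset)
{
	return (in_global_zone || strchr(dataset, '@') != NULL);
}
```

Under this model, early_visible(0, "tank/home@monday") is true for a caller in any jail, while early_visible(0, "tank/home") is false and still falls through to the normal per-jail check. The dataset names here are examples only.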
I will continue looking at this, but since my main goal of working
around the panic has been taken care of, I am not sure how long my
attention span will last. If the cause of
curthread->td_ucred->cr_prison being NULL under these conditions is
obvious to anyone, please let me know.
--
Scott Burns
System Administrator
BQ Internet Corporation