Deadlock in nullfs/zfs somewhere
Konstantin Belousov
kostikbel at gmail.com
Sun Aug 18 15:04:34 UTC 2013
On Wed, Aug 07, 2013 at 11:47:44AM +0300, Andriy Gapon wrote:
>
> Kostik,
>
> thank you for being patient with me and explaining details of the contract and
> inner workings of VFS suspend.
>
> As we discussed out-of-band, unfortunately, it seems that it is impossible to
> implement the same contract for ZFS. The reason is that ZFS filesystems appear
> as many independent filesystems, but in reality they share a common pool. So
> suspending a single filesystem does not suspend the pool and that is contrary to
> current VFS suspend concept.
>
> Additionally, ZFS needs a "full" suspend mechanism that would prevent both read
> and write access from VFS layer. The current VFS suspend mechanism suspends
> writes / modifications only.
>
> I am not sure how to reconcile the differences...
> Here are a number of rough ideas. I will highly appreciate your opinion and
> suggestions.
>
> Idea #1.
> Add a new suspend type to VFS layer that would correspond to the needs of ZFS.
> This is quite laborious as it would require adding vn_start_read calls in many
> places. Also, making two kinds of VFS suspend play nice with each other could
> be non-trivial.
If you mean adding a 'full suspend' mechanism, as opposed to the
existing 'write suspend', then yes, this is the correct approach, and
it would probably be useful outside ZFS as well. Its immediate
application is, e.g., unmounts.
It is indeed very laborious and probably quite non-trivial, since the
suspend lock must come before any filesystem-level blocking primitives
in the lock order, probably including vfs_busy().
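
As a rough illustration of the 'full suspend' idea, here is a minimal
userspace model in C of a gate that both readers and writers must pass
before taking any other filesystem lock. It is only a sketch: it uses
pthreads rather than kernel primitives, and every name in it (fs_gate,
fs_gate_enter, fs_gate_exit, fs_gate_suspend, fs_gate_resume) is
hypothetical and not an existing FreeBSD interface.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are FreeBSD KPIs. */
struct fs_gate {
	pthread_mutex_t	mtx;
	pthread_cond_t	cv;
	int		active;		/* reads and writes currently inside */
	bool		suspended;	/* a full suspend is pending or in effect */
};

void
fs_gate_init(struct fs_gate *g)
{
	pthread_mutex_init(&g->mtx, NULL);
	pthread_cond_init(&g->cv, NULL);
	g->active = 0;
	g->suspended = false;
}

/*
 * Called before any other filesystem-level lock is taken, for reads as
 * well as writes.  With nowait set, returns EBUSY instead of sleeping.
 */
int
fs_gate_enter(struct fs_gate *g, bool nowait)
{
	int error = 0;

	pthread_mutex_lock(&g->mtx);
	while (g->suspended && error == 0) {
		if (nowait)
			error = EBUSY;
		else
			pthread_cond_wait(&g->cv, &g->mtx);
	}
	if (error == 0)
		g->active++;
	pthread_mutex_unlock(&g->mtx);
	return (error);
}

void
fs_gate_exit(struct fs_gate *g)
{
	pthread_mutex_lock(&g->mtx);
	assert(g->active > 0);
	if (--g->active == 0)
		pthread_cond_broadcast(&g->cv);	/* wake a draining suspender */
	pthread_mutex_unlock(&g->mtx);
}

/* Block new entries, then wait for in-flight operations to drain. */
int
fs_gate_suspend(struct fs_gate *g, bool nowait)
{
	int error = 0;

	pthread_mutex_lock(&g->mtx);
	if (g->suspended)
		error = EALREADY;
	else if (nowait && g->active > 0)
		error = EBUSY;
	else {
		g->suspended = true;
		while (g->active > 0)
			pthread_cond_wait(&g->cv, &g->mtx);
	}
	pthread_mutex_unlock(&g->mtx);
	return (error);
}

void
fs_gate_resume(struct fs_gate *g)
{
	pthread_mutex_lock(&g->mtx);
	g->suspended = false;
	pthread_cond_broadcast(&g->cv);	/* release threads waiting to enter */
	pthread_mutex_unlock(&g->mtx);
}
```

The model deliberately keeps the gate outermost: a thread that slept in
fs_gate_enter() while holding another filesystem lock could prevent the
in-flight operations from ever draining.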
>
> Idea #2.
> This is perhaps an ugly approach, but I already have it implemented locally.
> The idea is to re-use / abuse vnode locking as a ZFS suspend barrier.
> (This can be considered to be analogous to putting vn_start_op() / vn_end_op()
> into vop_lock / vop_unlock).
> That is, ZFS would override VOP_LOCK/VOP_UNLOCK to check for internal
> suspension. The necessary care would be taken to respect all locking flags
> including LK_NOWAIT. Recursive entry would have to be supported too.
Please note that nandfs used a somewhat similar approach, which caused
obvious bugs the last time I looked. At least, lookups were knowingly
broken with regard to lock order.
Devfs uses an internal lock to protect the mount point, which comes
after vnode locks in the lock order. Correcting the operation of the
dm_lock required quite an effort; look at DEVFS_DMP_DROP etc. in the
devfs code.
If this is constrained to ZFS without any effect on VFS, I do not care.
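
To make the requirements of Idea #2 concrete (honouring LK_NOWAIT and
allowing recursive entry), the barrier check could be modelled in
userspace C roughly as follows. This is a sketch under stated
assumptions, not the local implementation mentioned above: ZG_NOWAIT
merely stands in for LK_NOWAIT, all zgate_* names are hypothetical, the
per-vnode lock itself is omitted, and a single barrier per process is
assumed.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

#define	ZG_NOWAIT	0x1	/* hypothetical stand-in for LK_NOWAIT */

struct zgate {
	pthread_mutex_t	mtx;
	pthread_cond_t	cv;
	int		holders;	/* threads currently past the barrier */
	bool		suspended;	/* suspension requested or in effect */
};

/* Per-thread count of model locks held (assumes one barrier only). */
static _Thread_local int zg_depth;

void
zgate_init(struct zgate *g)
{
	pthread_mutex_init(&g->mtx, NULL);
	pthread_cond_init(&g->cv, NULL);
	g->holders = 0;
	g->suspended = false;
}

/* Model of the check an overridden lock operation would perform. */
int
zgate_lock(struct zgate *g, int flags)
{
	if (zg_depth > 0) {
		/* Recursive entry: waiting here would stall the drain. */
		zg_depth++;
		return (0);
	}
	pthread_mutex_lock(&g->mtx);
	while (g->suspended) {
		if (flags & ZG_NOWAIT) {
			pthread_mutex_unlock(&g->mtx);
			return (EBUSY);
		}
		pthread_cond_wait(&g->cv, &g->mtx);
	}
	g->holders++;
	pthread_mutex_unlock(&g->mtx);
	zg_depth = 1;
	/* The real per-vnode lock would be acquired at this point. */
	return (0);
}

/* Model of the matching unlock operation. */
void
zgate_unlock(struct zgate *g)
{
	assert(zg_depth > 0);
	if (--zg_depth > 0)
		return;
	pthread_mutex_lock(&g->mtx);
	if (--g->holders == 0)
		pthread_cond_broadcast(&g->cv);	/* wake the suspender */
	pthread_mutex_unlock(&g->mtx);
}

/* Phase 1: stop new threads from passing the barrier. */
void
zgate_suspend_request(struct zgate *g)
{
	pthread_mutex_lock(&g->mtx);
	g->suspended = true;
	pthread_mutex_unlock(&g->mtx);
}

/* Phase 2: wait until every thread already inside has left. */
void
zgate_suspend_wait(struct zgate *g)
{
	pthread_mutex_lock(&g->mtx);
	while (g->holders > 0)
		pthread_cond_wait(&g->cv, &g->mtx);
	pthread_mutex_unlock(&g->mtx);
}

void
zgate_resume(struct zgate *g)
{
	pthread_mutex_lock(&g->mtx);
	g->suspended = false;
	pthread_cond_broadcast(&g->cv);
	pthread_mutex_unlock(&g->mtx);
}
```

Note that the model does nothing about the lock-order concern raised
above: a thread sleeping at the barrier may already hold locks that the
suspending thread needs.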
>
> Idea #3.
> Provide some other mechanism to expose ZFS suspension state to VFS. And then
> use that mechanism to avoid blocking on calls to ZFS in the strategic /
> sensitive places like vlrureclaim(), vtryrecycle(), etc.