kern/184677 / ZFS snapshot handling deadlocks

Andriy Gapon avg at FreeBSD.org
Tue Jan 14 11:47:15 UTC 2014


on 25/12/2013 07:22 krichy at tvnetwork.hu said the following:
> I've made a new patch again, which fixes most of my earlier issues. Mainly, I've
> enabled shared vnode locks for GFS vnodes - is it acceptable? And that way,
> deadlock cases reduced much, and also the patch is more clear (at least to me).
> One thing still remains, the spa_namespace_lock race I mentioned before, I don't
> know how to handle that.
> 
> Waiting for comments on this.

Richard,

first of all, apologies for the delay in replying and for not having any
comment on the essence of your patch...

I do have the following meta-comment.

- working with FreeBSD VFS is hard
- porting code that was written for a very different VFS model to FreeBSD VFS is
even harder
- doing the same for the code that plays various tricks, like .zfs support code
does, is harder again
- reviewing somebody else's changes to that kind of code is harder still than
making such changes

This is quite an unfortunate situation.  I am not much surprised by the lack of
followups to your analysis and patches.
I am saying this as someone who spent some time analyzing and trying to fix the
.zfs code and ZFS<->VFS interaction in general.  See e.g.
http://thread.gmane.org/gmane.os.freebsd.devel.file-systems/18924/focus=19056

My opinion is that the .zfs code breaks several fundamental FreeBSD VFS
contracts.  The most glaring violation is that VOP_INACTIVE makes a vnode
invalid.
I think that trying to fix the .zfs code by patching individual problems here
and there is an uphill battle.  I think that the same also applies to the ZFS
ZPL code, but in a less obvious way.
In many cases the code just pretends to play by VFS rules by satisfying the
most obvious assertions, but it does not really try to obey VFS contracts.  For
example, almost all locks in the znode are largely redundant given VFS vnode
locking.  But in some cases they are not sufficient, precisely because VFS
expects its locking to be used rather than ZFS internal locking.  The most
obvious example is the interaction of VOP_RENAME with other VOPs.

In any case, thanks for your work!  I am trying to find some time to review it.

-- 
Andriy Gapon

