Deadlock in nullfs/zfs somewhere

Konstantin Belousov kostikbel at gmail.com
Thu Jul 18 11:28:18 UTC 2013


On Thu, Jul 18, 2013 at 12:33:58PM +0300, Andriy Gapon wrote:
> on 17/07/2013 20:19 Adrian Chadd said the following:
> > On 17 July 2013 04:26, Andriy Gapon <avg at freebsd.org> wrote:
> >> One possibility is to add getnewvnode_reserve() calls before the start of a ZFS
> >> transaction in the places where a new vnode/znode may have to be allocated within
> >> the transaction.
> >> This looks like a quick and cheap solution, but it makes the code somewhat messier.
> >>
> >> Another possibility is to change something in VFS machinery, so that VOP_RECLAIM
> >> getting blocked for one filesystem does not prevent vnode allocation for other
> >> filesystems.
> >>
> >> I could think of other possible solutions via infrastructural changes in VFS or
> >> ZFS...
> > 
> > Well, what do others think? This seems like a showstopper for systems
> > with lots and lots of ZFS filesystems doing lots and lots of activity.
> > 
> 
> Looks like others are not speaking yet :-)
> 
> My current idea is that ZFS should set MNTK_SUSPEND in the zfs_suspend_fs() path
> before acquiring its z_teardown* locks.  This should make the intentions of ZFS
> visible to VFS.  It should thus prevent VOP_RECLAIM calls on a suspended ZFS
> filesystem, which in turn should keep vnlru_free() from getting stuck.
> Hopefully this will break the deadlock cycle.
> 
> Kostik,
> 
> what is your opinion?
> For your convenience here is a message with my analysis of this issue:
> http://thread.gmane.org/gmane.os.freebsd.current/150889/focus=18534

Well, I have no opinion.  Suspending the fs, in other words preventing
writers from entering the filesystem code, is probably good.  I do not
know the ZFS code well enough to comment usefully on the approach.

Note that you must drain existing writers, i.e. call vfs_write_suspend(),
to set MNTK_SUSPEND.
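
To illustrate the ordering described above (request suspension first, then
wait for writers that are already inside to leave), here is a minimal
userland sketch.  It is a toy model, not kernel code: struct toy_mount and
all toy_* functions are hypothetical names invented for this example, and
where the real vfs_write_suspend() would sleep, the model simply returns
false so the behaviour can be checked in a single thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the relevant mount state; NOT the kernel's struct mount. */
struct toy_mount {
	int  writeopcount;	/* writers currently inside the filesystem */
	bool suspend;		/* models MNTK_SUSPEND: new writers are refused */
};

/* A writer tries to enter.  The kernel would sleep; the model returns false. */
bool
toy_write_start(struct toy_mount *mp)
{
	if (mp->suspend)
		return (false);
	mp->writeopcount++;
	return (true);
}

/* A writer leaves the filesystem code. */
void
toy_write_finish(struct toy_mount *mp)
{
	assert(mp->writeopcount > 0);
	mp->writeopcount--;
}

/*
 * Request suspension.  The flag is set first so that no new writer can
 * enter; the return value says whether the existing writers have drained.
 * A caller would retry (the kernel would sleep) until this returns true.
 */
bool
toy_write_suspend(struct toy_mount *mp)
{
	mp->suspend = true;
	return (mp->writeopcount == 0);
}

/* Lift the suspension so writers may enter again. */
void
toy_write_resume(struct toy_mount *mp)
{
	mp->suspend = false;
}
```

In this model, a suspend request made while a writer is active blocks new
writers immediately but only reports success after toy_write_finish() has
brought the count back to zero.  That is the sense in which merely setting
the flag is not enough.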