Need to force sync(2) before umounting UFS1 filesystems?

Kirk McKusick mckusick at mckusick.com
Fri Sep 30 13:25:24 UTC 2011


> Date: Thu, 29 Sep 2011 18:13:34 +0200
> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
> From: Attilio Rao <attilio at freebsd.org>
> To: Kirk McKusick <mckusick at mckusick.com>,
>         Konstantin Belousov <kib at freebsd.org>
> Cc: Garrett Cooper <yanegomi at gmail.com>, freebsd-fs at freebsd.org,
>         Xin LI <delphij at freebsd.org>
> 
> 2011/9/29 Kirk McKusick <mckusick at mckusick.com>:
> 
> > I am definitely not in a rush on this, so by all means take some time
> > to look it over. The EBUSY unmount has been in its current state
> > for several years, so I am fine with taking a few weeks to sort out
> > the correct solution. Indeed, I am glad that Garrett has volunteered
> > to do some more serious testing.
> >
> > If this general approach is not correct, I can put a hook in for just
> > UFS so that it can have its historic behavior. As you have noted, the
> > SU code has a lot of activity that gets done under the protection of
> > vfs_busy. So it may be the only filesystem for which draining the
> > vfs_busy lock during unmount is needed.
> 
> Honestly, my first thought was exactly that -- an option that was
> forcing the unmount sleep if SU is compiled in the kernel.

The above would solve the problem at hand (NanoBSD builds). But I
believe that this issue can arise with any filesystem that has
background behavior such as ZFS and NFS. It is just most evident
with UFS using SU.

> When you mention 'historical behaviour' you mean the behaviour UFS had
> even prior the introduction of the 'complete' VFS layer or it was the
> behaviour unmount(2) was expected to implemented since the beginning?

Synchronous unmount has exisited since at least the UNIX versions
that I used in the 1970's.

> My guess is that recent SU improvement by you and Jeff may have lead
> to higher vfs_busy() contention, thus making this behaviour just more
> visible.

You are correct. SU has a lot of background activity. And as memories
have grown bigger, the backlog has been able to get bigger and hence
more noticable. On a busy system I have measured over fourty calls to
"sync; sleep 1; umount" before the filesystem would finally unmount
successfully.

> BTW, I'm afraid the forced unmount case may have a possible deadlock.
> 
> thread1 is doing whatever codepath it wants and thread2 is doing
> unmount (forced right now):
> 
> thread1::vfs_busy()
> thread2::lock coveredvnode
> thread1::contests coveredvnode
> thread2::sleep because of thread1 vfs_busy

I agree that the above deadlock is possible. See suggested solution below.

> I think this deadlock was actually possible even with the old code, it
> was just a LOR between a vnode lock and mount lock.

Yes, this is a long-standing issue.

> I'm not sure if there was any invariant I discussed with Kostik in the
> past, preventing one way or another I'm forgetting about, but it seems
> a possible deadlock to me.
> 
> If you see this issue I'll make a patch for it.
> 
> Attilio

Here is my proposed fix. It does the unroll originally found in the
non-FORCE case before sleeping waiting for the vfs_busy to clear.
Is it acceptable to hold the mount mutex while calling VOP_UNLOCK?
If not, then it needs to be released before the unlock, reacquired
afterwards, and the check to see if the sleep is needed redone.

Index: vfs_mount.c
===================================================================
--- vfs_mount.c	(revision 225881)
+++ vfs_mount.c	(working copy)
@@ -1187,6 +1187,7 @@
 
 	mtx_assert(&Giant, MA_OWNED);
 
+top:
 	if ((coveredvp = mp->mnt_vnodecovered) != NULL) {
 		mnt_gen_r = mp->mnt_gen;
 		VI_LOCK(coveredvp);
@@ -1227,21 +1228,19 @@
 		mp->mnt_kern_flag |= MNTK_UNMOUNTF;
 	error = 0;
 	if (mp->mnt_lockref) {
-		if ((flags & MNT_FORCE) == 0) {
-			mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
-			    MNTK_UNMOUNTF);
-			if (mp->mnt_kern_flag & MNTK_MWAIT) {
-				mp->mnt_kern_flag &= ~MNTK_MWAIT;
-				wakeup(mp);
-			}
-			MNT_IUNLOCK(mp);
-			if (coveredvp)
-				VOP_UNLOCK(coveredvp, 0);
-			return (EBUSY);
+		mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
+		    MNTK_UNMOUNTF);
+		if (mp->mnt_kern_flag & MNTK_MWAIT) {
+			mp->mnt_kern_flag &= ~MNTK_MWAIT;
+			wakeup(mp);
 		}
+		if (coveredvp)
+			VOP_UNLOCK(coveredvp, 0);
 		mp->mnt_kern_flag |= MNTK_DRAINING;
 		error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,
 		    "mount drain", 0);
+		MNT_IUNLOCK(mp);
+		goto top;
 	}
 	MNT_IUNLOCK(mp);
 	KASSERT(mp->mnt_lockref == 0,

Does this seem like the correct fix to you?

	Kirk McKusick


More information about the freebsd-fs mailing list