Need to force sync(2) before umounting UFS1 filesystems?
Kirk McKusick
mckusick at mckusick.com
Fri Sep 30 13:25:24 UTC 2011
> Date: Thu, 29 Sep 2011 18:13:34 +0200
> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
> From: Attilio Rao <attilio at freebsd.org>
> To: Kirk McKusick <mckusick at mckusick.com>,
> Konstantin Belousov <kib at freebsd.org>
> Cc: Garrett Cooper <yanegomi at gmail.com>, freebsd-fs at freebsd.org,
> Xin LI <delphij at freebsd.org>
>
> 2011/9/29 Kirk McKusick <mckusick at mckusick.com>:
>
> > I am definitely not in a rush on this, so by all means take some time
> > to look it over. The EBUSY unmount has been in its current state
> > for several years, so I am fine with taking a few weeks to sort out
> > the correct solution. Indeed, I am glad that Garrett has volunteered
> > to do some more serious testing.
> >
> > If this general approach is not correct, I can put a hook in for just
> > UFS so that it can have its historic behavior. As you have noted, the
> > SU code has a lot of activity that gets done under the protection of
> > vfs_busy. So it may be the only filesystem for which draining the
> > vfs_busy lock during unmount is needed.
>
> Honestly, my first thought was exactly that -- an option that was
> forcing the unmount sleep if SU is compiled in the kernel.
The above would solve the problem at hand (NanoBSD builds). But I
believe that this issue can arise with any filesystem that has
background behavior such as ZFS and NFS. It is just most evident
with UFS using SU.
> When you mention 'historical behaviour' you mean the behaviour UFS had
> even prior the introduction of the 'complete' VFS layer or it was the
> behaviour unmount(2) was expected to implemented since the beginning?
Synchronous unmount has exisited since at least the UNIX versions
that I used in the 1970's.
> My guess is that recent SU improvement by you and Jeff may have lead
> to higher vfs_busy() contention, thus making this behaviour just more
> visible.
You are correct. SU has a lot of background activity. And as memories
have grown bigger, the backlog has been able to get bigger and hence
more noticable. On a busy system I have measured over fourty calls to
"sync; sleep 1; umount" before the filesystem would finally unmount
successfully.
> BTW, I'm afraid the forced unmount case may have a possible deadlock.
>
> thread1 is doing whatever codepath it wants and thread2 is doing
> unmount (forced right now):
>
> thread1::vfs_busy()
> thread2::lock coveredvnode
> thread1::contests coveredvnode
> thread2::sleep because of thread1 vfs_busy
I agree that the above deadlock is possible. See suggested solution below.
> I think this deadlock was actually possible even with the old code, it
> was just a LOR between a vnode lock and mount lock.
Yes, this is a long-standing issue.
> I'm not sure if there was any invariant I discussed with Kostik in the
> past, preventing one way or another I'm forgetting about, but it seems
> a possible deadlock to me.
>
> If you see this issue I'll make a patch for it.
>
> Attilio
Here is my proposed fix. It does the unroll originally found in the
non-FORCE case before sleeping waiting for the vfs_busy to clear.
Is it acceptable to hold the mount mutex while calling VOP_UNLOCK?
If not, then it needs to be released before the unlock, reacquired
afterwards, and the check to see if the sleep is needed redone.
Index: vfs_mount.c
===================================================================
--- vfs_mount.c (revision 225881)
+++ vfs_mount.c (working copy)
@@ -1187,6 +1187,7 @@
mtx_assert(&Giant, MA_OWNED);
+top:
if ((coveredvp = mp->mnt_vnodecovered) != NULL) {
mnt_gen_r = mp->mnt_gen;
VI_LOCK(coveredvp);
@@ -1227,21 +1228,19 @@
mp->mnt_kern_flag |= MNTK_UNMOUNTF;
error = 0;
if (mp->mnt_lockref) {
- if ((flags & MNT_FORCE) == 0) {
- mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
- MNTK_UNMOUNTF);
- if (mp->mnt_kern_flag & MNTK_MWAIT) {
- mp->mnt_kern_flag &= ~MNTK_MWAIT;
- wakeup(mp);
- }
- MNT_IUNLOCK(mp);
- if (coveredvp)
- VOP_UNLOCK(coveredvp, 0);
- return (EBUSY);
+ mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
+ MNTK_UNMOUNTF);
+ if (mp->mnt_kern_flag & MNTK_MWAIT) {
+ mp->mnt_kern_flag &= ~MNTK_MWAIT;
+ wakeup(mp);
}
+ if (coveredvp)
+ VOP_UNLOCK(coveredvp, 0);
mp->mnt_kern_flag |= MNTK_DRAINING;
error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,
"mount drain", 0);
+ MNT_IUNLOCK(mp);
+ goto top;
}
MNT_IUNLOCK(mp);
KASSERT(mp->mnt_lockref == 0,
Does this seem like the correct fix to you?
Kirk McKusick
More information about the freebsd-fs
mailing list