fixing "umount -f" for the NFS client

Fri Aug 30 13:00:52 UTC 2013

Kostik wrote:
> On Thu, Aug 29, 2013 at 07:43:34PM -0400, Rick Macklem wrote:
> > Kostik wrote:
> > > On Thu, Aug 29, 2013 at 06:21:41PM -0400, Rick Macklem wrote:
> > > > Kostik wrote:
> > > > > On Wed, Aug 28, 2013 at 08:15:27PM -0400, Rick Macklem wrote:
> > > > > > I've been doing a little more testing of "umount -f" for
> > > > > > NFS
> > > > > > mounts and they seem to be working unless some other
> > > > > > process/thread
> > > > > > has busied the file system via vfs_busy().
> > > > > > 
> > > > > > Unfortunately, it is pretty easy to vfs_busy() the file
> > > > > > system
> > > > > > by using a command like "df" that is stuck on the
> > > > > > unresponsive
> > > > > > NFS server.
> > > > > > 
> > > > > > The problem seems to be that dounmount() msleep()s while
> > > > > > mnt_lockref != 0 before calling VFS_UNMOUNT().
> > > > > > 
> > > > > > If some call into the NFS client was done before this
> > > > > > while (mp->mnt_lockref) loop with msleep() in it, it
> > > > > > can easily kill off RPCs in progress. (It currently
> > > > > > does this in nfs_unmount() using the newnfs_nmcancelreqs()
> > > > > > call.
> > > > > > 
> > > > > > In summary:
> > > > > > - Would it be appropriate to add a new vfs_XXX method that
> > > > > >   dounmount() would call before the while() loop for the
> > > > > >   forced dismount case?
> > > > > >   (The default would be a no-op and I have no idea if any
> > > > > >    file system other than NFS would have a use for it?)
> > > > > >   Alternately, there could be a function pointer set
> > > > > >   non-NULL
> > > > > >   that would specifically be used by the NFS client for
> > > > > >   this.
> > > > > >   This would avoid adding a vfs_XXX() method, but would
> > > > > >   mean
> > > > > >   an NFS specific call ends up in the generic dounmount()
> > > > > >   code.
> > > > > > 
> > > > > > Anyone have comments on this?
> > > > > > 
> > > > > Yes, I do.  I agree with adding the pre-unmount vfs method.
> > > > > This seems to be the cleanest solution possible.
> > > > > 
> > > > I've attached a patch. It is also at
> > > >   http://people.freebsd.org/~rmacklem/forced-dism.patch
> > > > in case the attachment gets lost.
> > > > I don't really like doing the MNT_IUNLOCK(), MNT_ILOCK()
> > > > before/after
> > > > the VFS_KILLIO() call, but I couldn't see any better way to do
> > > > it
> > > > and
> > > > it looks safe to do so, at least for the forced case.
> > > Might be, call it VFS_PURGE() ?
> > > 
> > Sure, any name is fine with me.
> > 
> > > I suggest to move the call to the VFS_KILLIO after the
> > > MNTK_DRAINING
> > > is
> > > set, to avoid getting new references after the current i/o
> > > transactions
> > > are stopped. You would need to set MNTK_DRAINING unconditionally.
> > > Also,
> > > it probably makes sense to replace the if (mnt_lockref) with
> > > while
> > > ().
> > > 
> > Hmm. When I look at the code, the only use of MNTK_DRAINING seems
> > to
> > be to tell vfs_unbusy() to do a wakeup() if mnt_lockref == 0. I
> > don't
> > see why setting it before VFS_PURGE() would matter?
> > 
> > Let me explain what the NFS client does:
> > If is sees MNTK_UNMOUNTF set, it fails VOP/VFS calls without
> > attempting
> > any RPCs. That's why I needed MNTK_UNMOUNTF set before the
> > VFS_PURGE()
> > call. The VFS_PURGE() call causes any RPC that is already in
> > progress to
> > fail (by closing the connection to the server). If there is a case
> > where
> > an RPC attempt can get stuck after this point, it's a bug in the
> > NFS client
> > I will need to find;-)
> > --> Once MNTK_UNMOUNTF is set and VFS_PURGE() is called, all
> > VOP/VFS ops
> >     should return failure without attempting to do RPCs against the
> >     server.
> > If some thread does vfs_busy() while VFS_PURGE() is in progress or
> > before
> > (any time MNT_ILOCK() isn't held) it should end up doing a
> > vfs_unbusy() at
> > some point without getting stuck trying to do an RPC against the
> > server.
> > If this happens before dounmount() re-acquires the MNT_ILOCK(), it
> > should be ok,
> > since mnt_lockref has been decremented.
> > If it does this after the dounmount() thread re-acquires
> > MNT_ILOCK(), dounmount()
> > should be in the msleep() with MNTK_DRAINING set, so it will get
> > the wakeup
> > once mnt_lockref has decremented to 0.
> > 
> > Setting MNTK_DRAINING sooner would just result in the odd
> > unnecessary wakeup(),
> > from what I can see?
> > 
> Hm, I mis-remembered the vfs_busy() code.  Yes, I agree with you.
> 
> > > > 
> > > > I assume I would also need to bump __FreeBSD_version (and maybe
> > > > VFS_VERSION?).
> > > I think you could avoid it.
> > > 
> > Do you mean I don't need to bump __FreeBSD_version or VFS_VERSION
> > or both?
> I do not see much sense in bumping either of them.
> You might want to bump __FreeBSD_version when merging to stable.
> 
Ok, thanks. I'll consider the code reviewed by you unless I here
otherwise from you.

rick