Major issues with nfsv4

Konstantin Belousov kostikbel at gmail.com
Sun Dec 27 15:50:36 UTC 2020


On Sat, Dec 26, 2020 at 11:10:01PM +0000, Rick Macklem wrote:
> Although you have not posted the value for
> vfs.deferred_inact, if that value has become
> relatively large when the problem occurs,
> it might support this theory w.r.t. how this
> could happen.
> 
> Two processes in different jails do "stat()" or
> similar on the same file in the NFS file system
> at basically the same time.
> --> They both get shared locked nullfs vnodes,
>       both of which hold shared locks on the
>       same lowervp (the NFS client one).
> --> They both do vput() on these nullfs vnodes
>       concurrently.
> 
> If both call vput_final() concurrently, I think both
> could have the VOP_LOCK(vp, LK_UPGRADE | LK_INTERLOCK |
>    LK_NOWAIT) at line #3147 fail, since this will call null_lock()
> for both nullfs vnodes and then both null_lock() calls will
> do VOP_LOCK(lvp, flags); at line #705.
> --> The call fails for both processes, since the other one still
>       holds the shared lock on the NFS client vnode.
> 
> If I have this right, then both processes end up calling
> vdefer_inactive() for the upper nullfs vnodes.
> 
> If this is what is happening, then when does the VOP_INACTIVE()
> get called for the lowervp?
> 
> I see vfs_deferred_inactive() in sys/kern/vfs_subr.c, but I do not
> know when/how it gets called?
Right, vfs_deferred_inactive() is one mechanism that tries to handle
missed inactivations. If, upon vput(), the lock is only shared and the
upgrade fails, the vnode is marked VI_OWEINACT and put onto the 'lazy'
list, which is processed by vfs_sync(MNT_LAZY). That is typically called
from the syncer, which means every 60 seconds. There, if the vnode is
still unreferenced, it is inactivated.
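
To make the deferral path concrete, here is a minimal userland sketch of it. This is a toy model, not the kernel code: every name in it (toy_lock, toy_vnode, toy_vput, toy_sync_lazy, TOY_OWEINACT) is hypothetical, the lock is reduced to a counter of shared holders, and the shared lock on the lower vnode is modelled as one toy_lock referenced by two upper vnodes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these are the kernel's real definitions. */
enum { TOY_OWEINACT = 0x1 };    /* stand-in for VI_OWEINACT */

struct toy_lock {
    int shared_holders;         /* number of shared lock holders */
};

struct toy_vnode {
    int usecount;               /* active references */
    int flags;
    bool on_lazy_list;
    int inactivated;            /* counts toy "VOP_INACTIVE" calls */
    struct toy_lock *lock;      /* may be shared with another vnode */
};

/* Non-sleeping shared->exclusive upgrade: works only for the sole holder. */
static bool toy_try_upgrade(const struct toy_lock *lk)
{
    return lk->shared_holders == 1;
}

/* Drop the caller's reference and its shared lock. */
static void toy_vput(struct toy_vnode *vp)
{
    vp->usecount--;
    if (vp->usecount == 0) {
        if (toy_try_upgrade(vp->lock)) {
            vp->inactivated++;          /* inactivate right away */
        } else {
            vp->flags |= TOY_OWEINACT;  /* remember the missed call */
            vp->on_lazy_list = true;    /* let the lazy pass retry */
        }
    }
    vp->lock->shared_holders--;
}

/* Periodic pass over the lazy list (the syncer's role in this model). */
static void toy_sync_lazy(struct toy_vnode **list, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct toy_vnode *vp = list[i];

        if (!vp->on_lazy_list)
            continue;
        vp->on_lazy_list = false;
        if ((vp->flags & TOY_OWEINACT) == 0 || vp->usecount > 0)
            continue;                   /* referenced again: skipped here */
        vp->flags &= ~TOY_OWEINACT;
        vp->inactivated++;              /* the deferred inactivation */
    }
}
```

With two vnodes sharing one toy_lock that has two shared holders, the first toy_vput() cannot upgrade and defers; the inactivation then happens only when toy_sync_lazy() runs, which is the delay described above.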

Another place where inactivation can occur is reclamation. There, in
vgonel(), we call VOP_INACTIVE() if VI_OWEINACT is set. In principle,
this is redundant, because a correct filesystem must do the same cleanup
(and more) at reclamation as at inactivation.  But we also call
VOP_CLOSE(FNONBLOCK) before VOP_RECLAIM().
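
As a rough illustration of the reclamation path, here is a toy userland sketch. It is not the vgonel() source: the names (toy_rvnode, toy_vgone, TOY_R_OWEINACT, TOY_R_DOOMED) are hypothetical, the close is done unconditionally, and placing the close ahead of the pending inactivation is an assumption of the sketch. Only "inactivate if the flag is set" and "close before reclaim" come from the description above.

```c
/* Toy model only; none of these are the kernel's real definitions. */
enum {
    TOY_R_OWEINACT = 0x1,       /* stand-in for VI_OWEINACT */
    TOY_R_DOOMED   = 0x2        /* vnode already reclaimed */
};

struct toy_rvnode {
    int flags;
    int closes;                 /* counts toy "VOP_CLOSE" calls */
    int inactivations;          /* counts toy "VOP_INACTIVE" calls */
    int reclaims;               /* counts toy "VOP_RECLAIM" calls */
};

/* Reclaim a vnode, performing any inactivation that is still owed. */
static void toy_vgone(struct toy_rvnode *vp)
{
    if (vp->flags & TOY_R_DOOMED)
        return;                         /* reclaim at most once */
    vp->flags |= TOY_R_DOOMED;

    vp->closes++;                       /* close before reclaim */
    if (vp->flags & TOY_R_OWEINACT) {
        vp->flags &= ~TOY_R_OWEINACT;
        vp->inactivations++;            /* the owed inactivation */
    }
    vp->reclaims++;                     /* final cleanup */
}
```

In this model a vnode that still owes an inactivation gets it at reclaim time, while a vnode without the flag is only closed and reclaimed.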

Looking at this from another angle, if inactivation for NFSv4 vnodes
is not called for longer than 2 minutes, perhaps there is a reference
leak. It is not due to VFS forgetting about a due VOP_INACTIVE() call.


More information about the freebsd-fs mailing list