Major issues with nfsv4

Rick Macklem rmacklem at
Sat Dec 26 23:10:05 UTC 2020

Although you have not posted the value for
vfs.deferred_inact, if that value has become
relatively large when the problem occurs,
it might support this theory w.r.t. how this
could happen.

Two processes in different jails do "stat()" or
similar on the same file in the NFS file system
at basically the same time.
--> They both get shared locked nullfs vnodes,
      both of which hold shared locks on the
      same lowervp (the NFS client one).
--> They both do vput() on these nullfs vnodes

If both call vput_final() concurrently, I think the VOP_LOCK(vp,
LK_UPGRADE | LK_INTERLOCK | LK_NOWAIT) at line #3147 could fail
for both, since it calls null_lock() for both nullfs vnodes, and
then both null_lock() calls do VOP_LOCK(lvp, flags); at line #705.
--> The call fails for both processes, since the other one still
      holds the shared lock on the NFS client vnode.

If I have this right, then both processes end up calling
vdefer_inactive() for the upper nullfs vnodes.

If this is what is happening, then when does the VOP_INACTIVE()
get called for the lowervp?

I see vfs_deferred_inactive() in sys/kern/vfs_subr.c, but I do not
know when/how it gets called.

Hopefully Kostik can evaluate/correct this theory?


From: owner-freebsd-fs at <owner-freebsd-fs at> on behalf of Rick Macklem <rmacklem at>
Sent: Wednesday, December 16, 2020 11:25 PM
To: J David; Konstantin Belousov
Cc: freebsd-fs at
Subject: Re: Major issues with nfsv4

If you can do so when the "Opens" count has gone fairly high,
please run "sysctl vfs.deferred_inact" and let us know what that
value is.


From: J David <j.david.lists at>
Sent: Sunday, December 13, 2020 10:51 PM
To: Konstantin Belousov
Cc: Rick Macklem; freebsd-fs at
Subject: Re: Major issues with nfsv4


On Sun, Dec 13, 2020 at 4:25 PM Konstantin Belousov <kostikbel at> wrote:
> Nullfs with -o nocache (default for NFS mounts) should not cache vnodes.
> So it is more likely a local load that has 130k files open.  Of course,
> it is the OP who can answer the question.

This I can rule out; there is no visible correlation between "Opens"
and the number of files open on the system.

Just finishing a test right now, and:

$ sudo nfsstat -E -c | fgrep -A1 OpenOwner
    OpenOwner        Opens    LockOwner        Locks       Delegs     LocalOwn
         4678        36245           15            6            0            0
$ sudo fstat  | wc -l
$ ps Haxlww | wc -l

The value of Opens increases consistently over time.

Killing the processes causing this behavior *did not* reduce the
number of OpenOwner or Opens.

Unmounting the nullfs mounts (after the processes were gone) *did*:

$ sudo nfsstat -E -c | fgrep -A1 OpenOwner
    OpenOwner        Opens    LockOwner        Locks       Delegs     LocalOwn
          130           41            0            0            0            0

Mutex contention was observed this time, but once it was apparent that
"Opens" was increasing over time, I didn't let the test get to the
point of disrupting activities.  This test ended at Opens = 36589,
which is well short of the previous 130,000+.  It is possible that
mutex contention becomes an issue once system CPU resources are
exhausted.

More about the results of the latest test after the data is analyzed.

After that's done, I'll attempt Rick's patch.  In the long run, we
would definitely like to get delegation to work.  Baby steps!

freebsd-fs at mailing list
To unsubscribe, send any mail to "freebsd-fs-unsubscribe at"
