Major issues with nfsv4
rmacklem at uoguelph.ca
Sat Dec 12 03:41:07 UTC 2020
J David wrote:
>On Fri, Dec 11, 2020 at 6:28 PM Rick Macklem <rmacklem at uoguelph.ca> wrote:
>> I am afraid I know nothing about nullfs and jails. I suspect it will be
>> something related to when file descriptors in the NFS client mount
>> get closed.
>What does NFSv4 do differently than NFSv3 that might upset a low-level
>consumer like nullfs?
The Opens, for one. When a file is opened, the operation finds its way to VOP_OPEN().
--> For NFSv3 all it does is some client side cache consistency checks.
--> For NFSv4, it must acquire or update a NFSv4 Open, which is a form
of lock that is acquired/updated by an Open operation in an RPC.
Then the client stores this locking info in a structure in a linked list
off of the mount point.
Once all file descriptors for the vnode are closed, then, and only
then can a Close operation be done against the server and the linked
list data structure be free'd.
--> Does having nullfs between the file descriptors and the NFS vnodes
for the same file affect when the v_usecount decrements to 0 on
the NFS vnode?
I don't know, but if it delays it, then these linked list structures
will not be free'd as soon and might accumulate.
--> The more structures, the longer the linked list and the more
overhead/cpu will be used processing them.
The fact that processes are spending a long time in exit() might
be a hint that there are a large # of these NFSv4 Opens to deal with
when files are being closed implicitly during exit.
As I mentioned, "nfsstat -c -E" will tell you how many Opens there
are under the "OpenOwners ..." line.
>> Well, NFSv3 is not going away any time soon, so if you don't need
>> any of the additional features it offers...
>If we did not want the additional features, we definitely would not be
>> a user would have to run their own custom hacked
>> userland NFS client. Although doable, I have never heard of it being done.
>Alex beat me to libnfs.
And you have users that would want to maliciously access the NFS server
running jobs on this environment? (Other than reverting to NFSv3, allowing
clients to use non-reserved port#s is probably your other choice, from what
I can see. Whatever the interaction between nullfs and the NFSv4 mount
is, it probably won't be fixed quickly, if ever.)
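If allowing non-reserved ports is the route you take, the knob on the FreeBSD NFS server side is, if memory serves, a sysctl (verify the name against sysctl -d on your version before relying on it):

```shell
# Accept NFS requests from non-reserved (>1023) source ports.
# Sysctl name from memory -- check "sysctl -d vfs.nfsd" on your system.
sysctl vfs.nfsd.nfs_privport=0
```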
>What about this as a stopgap measure?
>> How explosive would adding SO_REUSEADDR to the NFS client be? It's
>> not a full solution, but it would handle the TIME_WAIT side of the
>The kernel NFS networking code is confusing to me. I can't even
>figure out where/how NFSv4 binds a client socket to know if it's
>possible. (Pretty sure the code in sys/nfs/krpc_subr.c is not it.)
It's done in the kernel RPC code, found in the sys/rpc directory.
Mostly in clnt_rc.c and clnt_vc.c.
If there is a timeout for an RPC (slow server, network problem,...),
the code in clnt_rc.c will create a new TCP connection. The old
connection could easily still be around.
As such, I do not believe that SO_REUSEADDR or SO_REUSEPORT
would help here.