wanna solve the Linux NFSv4 client puzzle?

Rick Macklem rmacklem at uoguelph.ca
Mon May 3 00:27:53 UTC 2021


Rick Macklem wrote:
>Hi,
>
>I posted recently that enabling delegations should be avoided at this time,
>especially if your FreeBSD NFS server has Linux client mounts...
>
>I thought some of you might be curious why, and I thought it would be
>more fun if you look for yourselves.
>To play the game, you need to download a packet capture:
>fetch https://people.freebsd.org/~rmacklem/twoclientdeleg.pcap
>and then load it into wireshark.
>
>192.168.1.5 - FreeBSD server with all recent patches
>192.168.1.6 - FedoraCore 30 (Linux 5.2 kernel) client
>192.168.1.13 - FreeBSD client
>
>A few hints buried in RFC5661:
>- A fore channel is used for normal client->server RPCs and a back channel
>  is used for server->client callback RPCs.
>- After a new TCP connection is created, neither the fore nor back channels
>  are bound to the connection.
>- Binding channel(s) to a connection is done by BindConnectionToSession,
>  but an implicit binding for the fore channel is created when the first RPC
>  request with a Sequence operation in it is sent on the new TCP connection.
>- A server->client callback cannot be done until the back channel is bound
>  via BindConnectionToSession.
>
>Ok, so we are ready...
>- Look at packet #s 3518->3605.
>  - What is going on here?
Ok, so here's my solution...
Packets #3518, 3520 and 3521 are delegation recalls (CB_RECALL)
for three different delegations on three different session slots.
Time: 137.5

Expected response from the Linux client--> 3 replies to the CB_RECALLs.
What does it actually do?
--> Creates a new TCP connection using the same port#. You can see it send
      a FIN (packet# 3523) and a SYN (packet# 3527).
      This means that the client is no longer obliged to reply to the CB_RECALLs
      and the FreeBSD server will probably need to retry them.
      --> It also means that no back channel is bound to the session, so the
             server cannot do callbacks (i.e. cannot retry the CB_RECALLs yet).

packet# 3530 is a Setattr RPC, which has a Sequence operation in it.
--> This means the fore channel is implicitly bound to the new TCP
      connection, but no back channel, so the server cannot retry the CB_RECALLs.
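
The binding rules from the hints can be modelled in a few lines. This is
only a sketch of the RFC5661 rules as described above, not the FreeBSD
server code. The class and method names (Session, new_connection, sequence,
bind_conn_to_session, can_callback) are made up for illustration.

```python
class Session:
    """Toy model of which channels of one session are bound to the
    current TCP connection (names are hypothetical)."""

    def __init__(self):
        self.fore_bound = False
        self.back_bound = False

    def new_connection(self):
        # A fresh TCP connection starts with neither channel bound.
        self.fore_bound = False
        self.back_bound = False

    def sequence(self):
        # The first RPC carrying a Sequence operation implicitly binds
        # the fore channel only.
        self.fore_bound = True

    def bind_conn_to_session(self, fore=True, back=True):
        # Explicit binding of the requested channel(s).
        self.fore_bound = self.fore_bound or fore
        self.back_bound = self.back_bound or back

    def can_callback(self):
        # A CB_RECALL needs a bound back channel.
        return self.back_bound


s = Session()
s.new_connection()   # like the SYN at packet# 3527
s.sequence()         # like the Setattr at packet# 3530
print(s.fore_bound, s.can_callback())
```

In this model the state after packet# 3530 is "fore channel bound, no
callbacks possible", which is the situation the server is stuck in.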

You will notice a bunch of Setattr RPCs getting NFS4ERR_DELAY replies.
This tells the Linux client to "try again later".
--> It happens because the FreeBSD server cannot perform the Setattr
      until the client returns a delegation.
      --> That requires a CB_RECALL.
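
The "try again later" loop looks roughly like this from the client side.
It is a sketch only, not the Linux client code: setattr_with_retry,
send_setattr and wait are hypothetical names, and the fake server below
just replays a canned list of replies. The numeric value of NFS4ERR_DELAY
is the one from RFC5661.

```python
NFS4_OK = 0
NFS4ERR_DELAY = 10008  # value from RFC5661


def setattr_with_retry(send_setattr, max_tries=10, wait=lambda attempt: None):
    """Call send_setattr() until it stops returning NFS4ERR_DELAY.

    send_setattr and wait are hypothetical callables supplied by the caller.
    Returns (status, number_of_tries).
    """
    status = NFS4ERR_DELAY
    for attempt in range(1, max_tries + 1):
        status = send_setattr()
        if status != NFS4ERR_DELAY:
            return status, attempt
        wait(attempt)  # a real client would sleep/back off here
    return status, max_tries


# Fake server: replies NFS4ERR_DELAY until the delegation has been returned,
# simulated here as happening after three attempts.
replies = iter([NFS4ERR_DELAY, NFS4ERR_DELAY, NFS4ERR_DELAY, NFS4_OK])
print(setattr_with_retry(lambda: next(replies)))
```

As long as the server cannot get the CB_RECALL through, every retry gets
the same NFS4ERR_DELAY answer, which is what the run of Setattr RPCs in
the capture shows.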

packet# 3582 is a Setattr RPC reply. If you look in the Sequence operation
reply, you will see the flag SEQ4_STATUS_CB_PATH_DOWN is set.
--> This is the FreeBSD server telling the Linux client that the callback path
       is down (the back channel is not bound to the new TCP connection).
Time: 137.6  (took about 0.1sec for the server to notice that the callback
                     path/back channel is not working).
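
If you want to check the flag by hand rather than trust the wireshark
decode, the status flags in the Sequence reply are a bit mask. A minimal
sketch, assuming the flag values below are as I read them in RFC5661
(only three of the defined flags are listed, and decode_status_flags is a
made-up helper name):

```python
# Flag values for the Sequence reply status flags, as read from RFC5661.
# Only the callback-related ones are listed; check the RFC for the rest.
SEQ4_STATUS_FLAGS = {
    0x00000001: "SEQ4_STATUS_CB_PATH_DOWN",
    0x00000200: "SEQ4_STATUS_CB_PATH_DOWN_SESSION",
    0x00000400: "SEQ4_STATUS_BACKCHANNEL_FAULT",
}


def decode_status_flags(flags):
    """Return the names of the known bits set in flags, lowest bit first.

    Any bits not in the table above are reported as one hex string.
    """
    names = [name for bit, name in sorted(SEQ4_STATUS_FLAGS.items())
             if flags & bit]
    known = 0
    for bit in SEQ4_STATUS_FLAGS:
        known |= bit
    unknown = flags & ~known
    if unknown:
        names.append(hex(unknown))
    return names


print(decode_status_flags(0x00000001))
```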

packet# 3604 Linux client does a BindConnectionToSession to bind the
       back channel.
--> This is not permitted by RFC5661. The BindConnectionToSession is
      required to be done on the new TCP connection before any implicit
      binding, and packet# 3530 had already implicitly bound the fore
      channel (only).
packet# 3605 FreeBSD server violates RFC5661 and allows the binding
     to be done, so that CB_RECALLs can again be done.
Time: 152.7

  - How long does this take?
    152.7 - 137.5 = 15.2 seconds
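
Splitting the stall up using the three timestamps noted above shows where
the time goes. The variable names are just labels I picked; Decimal is
used so the subtraction comes out exact.

```python
from decimal import Decimal

# Timestamps (seconds) taken from the capture as noted above.
recall_sent = Decimal("137.5")         # CB_RECALLs, packet# 3518
path_down_flagged = Decimal("137.6")   # CB_PATH_DOWN flag, packet# 3582
back_channel_bound = Decimal("152.7")  # BindConnectionToSession reply, packet# 3605

detect = path_down_flagged - recall_sent
waiting = back_channel_bound - path_down_flagged
stall = back_channel_bound - recall_sent
print(f"detect={detect}s waiting={waiting}s stall={stall}s")
# prints: detect=0.1s waiting=15.1s stall=15.2s
```

So almost all of the 15.2 seconds is the client waiting before it does the
BindConnectionToSession, not the server noticing the problem.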

>--> One more hint. Starting with #3605, things are working again.
      --> Things start working again because the FreeBSD server
            cheats and allows the BindConnectionToSession to be done.
            RFC5661 specifies a reply of NFS4ERR_INVAL for this.
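
The difference between what RFC5661 asks for and what the server does at
packet# 3605 comes down to one check. This is a sketch of that decision
only, not the FreeBSD code; bind_conn_to_session_status and its two
parameters are hypothetical names. The NFS4ERR_INVAL value is the one from
RFC5661.

```python
NFS4_OK = 0
NFS4ERR_INVAL = 22  # value from RFC5661


def bind_conn_to_session_status(fore_implicitly_bound, lenient):
    """Status a server might return for a BindConnectionToSession.

    fore_implicitly_bound: a Sequence was already seen on this connection.
    lenient: True models the "cheat" of allowing the late binding.
    """
    if fore_implicitly_bound and not lenient:
        return NFS4ERR_INVAL
    return NFS4_OK


# Packet# 3604 arrives after the implicit binding done by packet# 3530.
print(bind_conn_to_session_status(True, lenient=False))  # strict reply
print(bind_conn_to_session_status(True, lenient=True))   # what packet# 3605 does
```

With the strict reply the Linux 5.2 client would have no way to get the
back channel bound on this connection, which is why the server allows it.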

>There are actually 3 other examples of this in the packet capture.
Every time multiple concurrent callbacks are attempted, the Linux
client "bails out" by creating a new TCP connection.
--> This is said to be fixed in Linux 5.3, but I haven't tested a newer
       kernel than 5.2 yet.

>Btw, one of the weirdnesses is said to be fixed in Linux 5.3 and the other
>in Linux 5.7, although I have not yet upgraded my kernel and tested this.
The "do BindConnectionToSession after an implicit binding" is said to be fixed
in Linux 5.7. However, the fix is not exactly what I would have expected.
--> I would have expected a BindConnectionToSession to be done right
      away when a new TCP connection is created.
      --> Linux 5.7 and newer are said to still wait (15sec?) to do the
            BindConnectionToSession, but fix the bug by creating yet
            another new TCP connection just before doing the
            BindConnectionToSession RPC.
      --> A SEQ4_STATUS_CB_PATH_DOWN flag set in a Sequence operation
            reply is what triggers the BindConnectionToSession and that is still
            required for 5.7 or newer, but I'll need to test to see how long it
            takes for newer kernels.

The old "cheat", which is still in the released server code (recently removed
by a patch in main, stable/12 and stable/13) implicitly bound both the fore
and back channels. Look for this comment in sys/fs/nfsserver/nfs_nfsdstate.c
in unpatched code...
	/*
	 * If this session handles the backchannel, save the nd_xprt for this
	 * RPC, since this is the one being used.
	 * RFC-5661 specifies that the fore channel will be implicitly
	 * bound by a Sequence operation.  However, since some NFSv4.1 clients
	 * erroneously assumed that the back channel would be implicitly
	 * bound as well, do the implicit binding unless a
	 * BindConnectiontoSession has already been done on the session.
	 */

--> This worked fine and avoided most of the above craziness, but...
       (A) It violated RFC5661.
       and
       (B) It broke the Linux client badly when the "nconnect" mount
            option (added fairly recently) was used.
       --> So I felt I had to get rid of it. (The non-conformance with
              RFC5661 was reported by Red Hat.)
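
Going only by the comment quoted above, the old cheat and the patched
behaviour differ like this. It is a sketch, not the code from
nfs_nfsdstate.c: channels_bound_by_sequence and its parameters are made-up
names, and it ignores the "if this session handles the backchannel"
condition in the comment.

```python
def channels_bound_by_sequence(old_cheat, bind_conn_done):
    """Channels implicitly bound by the first Sequence on a connection.

    old_cheat: True models the unpatched server described in the comment.
    bind_conn_done: a BindConnectionToSession was already done on the session.
    """
    if old_cheat and not bind_conn_done:
        # Unpatched: the back channel was implicitly bound as well.
        return {"fore", "back"}
    # Patched (and what RFC5661 specifies): fore channel only.
    return {"fore"}


print(sorted(channels_bound_by_sequence(old_cheat=True, bind_conn_done=False)))
print(sorted(channels_bound_by_sequence(old_cheat=False, bind_conn_done=False)))
```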

Bottom line...unless all your Linux clients are kernel version 5.3 or newer,
avoid enabling delegations in the FreeBSD NFSv4.1/4.2 server.
--> Even with a completely patched server, you will still get 15 second pauses
      every time the server attempts multiple concurrent callbacks.

>Have fun with it, rick
At least you can now see why I have "fun with it";-) rick

_______________________________________________
freebsd-stable at freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"

