wanna solve the Linux NFSv4 client puzzle?
Alan Somers
asomers at freebsd.org
Mon May 3 00:35:35 UTC 2021
On Sun, May 2, 2021 at 6:27 PM Rick Macklem <rmacklem at uoguelph.ca> wrote:
> Rick Macklem wrote:
> >Hi,
> >
> >I posted recently that enabling delegations should be avoided at this time,
> >especially if your FreeBSD NFS server has Linux client mounts...
> >
> >I thought some of you might be curious why, and I thought it would be
> >more fun if you look for yourselves.
> >To play the game, you need to download a packet capture:
> >fetch https://people.freebsd.org/~rmacklem/twoclientdeleg.pcap
> >and then load it into wireshark.
> >
> >192.168.1.5 - FreeBSD server with all recent patches
> >192.168.1.6 - FedoraCore 30 (Linux 5.2 kernel) client
> >192.168.1.13 - FreeBSD client
> >
> >A few hints buried in RFC5661:
> >- A fore channel is used for normal client->server RPCs and a back channel
> > is used for server->client callback RPCs.
> >- After a new TCP connection is created, neither the fore nor back channels
> > are bound to the connection.
> >- Binding channel(s) to a connection is done by BindConnectionToSession,
> > but an implicit binding for the fore channel is created when the first RPC
> > request with a Sequence operation in it is sent on the new TCP connection.
> >- A server->client callback cannot be done until the back channel is bound
> > via BindConnectionToSession.
> >
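The binding rules in the hints above can be sketched as a toy state model. This is only an illustration under stated assumptions: the `Connection` class and its method names are hypothetical and are not taken from the FreeBSD or Linux sources.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """Toy model of one TCP connection's channel bindings to a session."""

    fore_bound: bool = False  # a new connection starts with nothing bound
    back_bound: bool = False

    def sequence(self) -> None:
        # The first RPC carrying a Sequence operation implicitly binds
        # the fore channel only; the back channel stays unbound.
        self.fore_bound = True

    def bind_conn_to_session(self, fore: bool = True, back: bool = True) -> None:
        # Explicit binding of one or both channels.
        self.fore_bound = self.fore_bound or fore
        self.back_bound = self.back_bound or back

    def can_callback(self) -> bool:
        # Server->client callbacks need a bound back channel.
        return self.back_bound


conn = Connection()
conn.sequence()
print(conn.fore_bound, conn.can_callback())
```

After only a Sequence operation the model reports a bound fore channel and no callback path, which is the state the server is left in once the client reconnects.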
> >Ok, so we are ready...
> >- Look at packet #s 3518->3605.
> > - What is going on here?
> Ok, so here's my solution...
> packet #3518, 3520 and 3521 are delegation recalls (CB_RECALL)
> for 3 different delegations on three different session slots.
> time: 137.5
>
> Expected response from the Linux client--> 3 replies to the CB_RECALLs.
> What does it actually do?
> --> Creates a new TCP connection using the same port#. You can see it send
> a FIN (packet# 3523) and a SYN (packet# 3527).
> This means that the client is no longer obliged to reply to the CB_RECALLs
> and the FreeBSD server will probably need to retry them.
> --> It also means that no back channel is bound to the session, so the
> server cannot do callbacks (i.e. cannot retry the CB_RECALLs yet).
>
> packet# 3530 is a Setattr RPC, which has a Sequence operation in it.
> --> This means the fore channel is implicitly bound to the new TCP
> connection, but no back channel, so the server cannot retry the
> CB_RECALLs.
>
> You will notice a bunch of Setattr RPCs getting NFS4ERR_DELAY replies.
> This tells the Linux client to "try again later".
> --> It happens because the FreeBSD server cannot perform the Setattr
> until the client returns a delegation.
> --> That requires a CB_RECALL.
>
> packet# 3582 is a Setattr RPC reply. If you look in the Sequence operation
> reply, you will see the flag SEQ4_STATUS_CB_PATH_DOWN is set.
> --> This is the FreeBSD server telling the Linux client that the callback path
> is down (the back channel is not bound to the new TCP connection).
> Time: 137.6 (took about 0.1sec for the server to notice that the callback
> path/back channel is not working).
>
> packet# 3604 Linux client does a BindConnectionToSession to bind the
> back channel.
> --> This is not permitted by RFC5661, which requires it to be done on
> the new TCP connection before the fore channel is implicitly bound.
> Packet# 3530 already created that fore-channel-only binding.
> packet# 3605 FreeBSD server violates RFC5661 and allows the binding
> to be done, so that CB_RECALLs can again be done.
> Time: 152.7
>
> - How long does this take?
> 152.7 - 137.5 = 15.2 seconds
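The timeline above can be laid out as a short calculation. This is a sketch only: the timestamps and packet numbers are the ones quoted in this message, and the `events` list is a hypothetical structure used for illustration.

```python
# (capture time in seconds, event) pairs as quoted in the message
events = [
    (137.5, "CB_RECALLs sent (packets 3518, 3520, 3521)"),
    (137.6, "SEQ4_STATUS_CB_PATH_DOWN set in reply (packet 3582)"),
    (152.7, "BindConnectionToSession allowed (packet 3605)"),
]

start = events[0][0]
for t, what in events:
    # Show each event with its offset from the first CB_RECALL.
    print(f"{t:6.1f}s  +{t - start:4.1f}s  {what}")

# Total time during which callbacks could not be done.
stall = round(events[-1][0] - start, 1)
print(stall)
```

Most of the stall falls between the server flagging the callback path as down and the client binding the back channel.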
>
> >--> One more hint. Starting with #3605, things are working again.
> --> Things start working again because the FreeBSD server
> cheats and allows the BindConnectionToSession to be done.
> RFC5661 specifies a reply of NFS4ERR_INVAL for this.
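The difference between the strict behaviour and the "cheat" can be sketched as follows. This assumes the reading of RFC5661 given in this message; the function and its parameters are hypothetical and are not the FreeBSD server code.

```python
def bind_conn_to_session(fore_implicitly_bound: bool, strict: bool) -> str:
    """Return a status name for a BindConnectionToSession request.

    fore_implicitly_bound: a Sequence operation was already seen on this
        connection, so the fore channel is implicitly bound.
    strict: follow the RFC5661 reading above instead of the lenient cheat.
    """
    if fore_implicitly_bound and strict:
        # Strict reading: too late to bind on this connection.
        return "NFS4ERR_INVAL"
    # Lenient server: allow the binding so callbacks can resume.
    return "NFS4_OK"


print(bind_conn_to_session(True, strict=True))
print(bind_conn_to_session(True, strict=False))
```

With `strict=False` the late binding is accepted, which is what lets the CB_RECALLs go out again from packet# 3605 onward.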
>
> >There are actually 3 other examples of this in the packet capture.
> Every time multiple concurrent callbacks are attempted, the Linux
> client "bails out" by creating a new TCP connection.
> --> This is said to be fixed in Linux 5.3, but I haven't tested a newer
> kernel than 5.2 yet.
>
> >Btw, one of the weirdnesses is said to be fixed in Linux 5.3 and the other
> >in Linux 5.7, although I have not yet upgraded my kernel and tested this.
> The "do BindConnectionToSession after an implicit binding" is said to be fixed
> in Linux 5.7, however the fix is not exactly what I would have expected.
> --> I would have expected a BindConnectionToSession to be done right
> away when a new TCP connection is created.
> --> Linux 5.7 and newer is said to still wait (15sec?) to do the
> BindConnectionToSession, but fixes the bug by creating yet
> another new TCP connection just before doing the
> BindConnectionToSession RPC.
> --> A SEQ4_STATUS_CB_PATH_DOWN flag set in a Sequence operation
> reply is what triggers the BindConnectionToSession and that is still
> required for 5.7 or newer, but I'll need to test to see how long it takes
> for newer kernels.
>
> The old "cheat", which is still in the released server code (recently removed
> by a patch in main, stable/12 and stable/13) implicitly bound both the fore
> and back channels. Look for this comment in sys/fs/nfsserver/nfs_nfsdstate.c
> in unpatched code...
> /*
> * If this session handles the backchannel, save the nd_xprt for this
> * RPC, since this is the one being used.
> * RFC-5661 specifies that the fore channel will be implicitly
> * bound by a Sequence operation. However, since some NFSv4.1 clients
> * erroneously assumed that the back channel would be implicitly
> * bound as well, do the implicit binding unless a
> * BindConnectiontoSession has already been done on the session.
> */
>
> --> This worked fine and avoided most of the above craziness, but...
> (A) It violated RFC5661.
> and
> (B) It broke the Linux client badly when the "nconnect" mount
> option (added fairly recently) was used.
> --> So I felt I had to get rid of it. (The non-conformance with
> RFC5661 was reported by redhat.)
>
> Bottom line...unless all your Linux clients are kernel version 5.3 or newer,
> avoid enabling delegations in the FreeBSD NFSv4.1/4.2 server.
> --> Even with a completely patched server, you will still get 15 second pauses
> every time the server attempts multiple concurrent callbacks.
>
> >Have fun with it, rick
> At least you can now see why I have "fun with it";-) rick
>
Ughh. I'm glad you figured it out so I didn't have to. Thanks for all the
hard work, Rick.