wanna solve the Linux NFSv4 client puzzle?

Alan Somers asomers at freebsd.org
Mon May 3 00:35:35 UTC 2021


On Sun, May 2, 2021 at 6:27 PM Rick Macklem <rmacklem at uoguelph.ca> wrote:

> Rick Macklem wrote:
> >Hi,
> >
> >I posted recently that enabling delegations should be avoided at this time,
> >especially if your FreeBSD NFS server has Linux client mounts...
> >
> >I thought some of you might be curious why, and I thought it would be
> >more fun if you look for yourselves.
> >To play the game, you need to download a packet capture:
> >fetch https://people.freebsd.org/~rmacklem/twoclientdeleg.pcap
> >and then load it into wireshark.
> >
> >192.168.1.5 - FreeBSD server with all recent patches
> >192.168.1.6 - Fedora 30 (Linux 5.2 kernel) client
> >192.168.1.13 - FreeBSD client
> >
> >A few hints buried in RFC5661:
> >- A fore channel is used for normal client->server RPCs and a back channel
> >  is used for server->client callback RPCs.
> >- After a new TCP connection is created, neither the fore nor back channels
> >  are bound to the connection.
> >- Binding channel(s) to a connection is done by BindConnectionToSession,
> >  but an implicit binding for the fore channel is created when the first RPC
> >  request with a Sequence operation in it is sent on the new TCP connection.
> >- A server->client callback cannot be done until the back channel is bound
> >  via BindConnectionToSession.
> >Ok, so we are ready...
> >- Look at packet #s 3518->3605.
> >  - What is going on here?
> Ok, so here's my solution...
> Packets #3518, #3520 and #3521 are delegation recalls (CB_RECALL)
> for three different delegations on three different session slots.
> Time: 137.5
>
> Expected response from the Linux client--> 3 replies to the CB_RECALLs.
> What does it actually do?
> --> Creates a new TCP connection using the same port#. You can see it send
>       a FIN (packet# 3523) and a SYN (packet# 3527).
>       This means that the client is no longer obliged to reply to the CB_RECALLs
>       and the FreeBSD server will probably need to retry them.
>       --> It also means that no back channel is bound to the session, so the
>              server cannot do callbacks (i.e. cannot retry the CB_RECALLs yet).
>
> packet# 3530 is a Setattr RPC, which has a Sequence operation in it.
> --> This means the fore channel is implicitly bound to the new TCP
>       connection, but the back channel is not, so the server cannot retry
>       the CB_RECALLs.
>
> You will notice a bunch of Setattr RPCs getting NFS4ERR_DELAY replies.
> This tells the Linux client to "try again later".
> --> It happens because the FreeBSD server cannot perform the Setattr
>       until the client returns a delegation.
>       --> That requires a CB_RECALL.
>
> packet# 3582 is a Setattr RPC reply. If you look in the Sequence operation
> reply, you will see the flag SEQ4_STATUS_CB_PATH_DOWN is set.
> --> This is the FreeBSD server telling the Linux client that the callback path
>        is down (the back channel is not bound to the new TCP connection).
> Time: 137.6  (took about 0.1sec for the server to notice that the callback
>                      path/back channel is not working).
>
> packet# 3604 Linux client does a BindConnectionToSession to bind the
>        back channel.
> --> This is not permitted by RFC5661, which requires it to be done on
>       the new TCP connection before any implicit binding of the fore
>       channel only. That implicit binding was already done by packet# 3530.
> packet# 3605 FreeBSD server violates RFC5661 and allows the binding
>      to be done, so that CB_RECALLs can again be done.
> Time: 152.7
>
>   - How long does this take?
>     152.7 - 137.5 = 15.2 seconds
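To see where the 15.2 seconds go, the timestamps quoted above can be split into two intervals. This is only arithmetic on the three times given in the message; the variable names are made up for the illustration, and Decimal is used so the subtraction does not pick up floating-point noise.

```python
from decimal import Decimal

# Capture timestamps in seconds, as quoted in the message.
first_cb_recall = Decimal("137.5")       # packet# 3518, first CB_RECALL
cb_path_down_seen = Decimal("137.6")     # packet# 3582, SEQ4_STATUS_CB_PATH_DOWN set
back_channel_rebound = Decimal("152.7")  # packet# 3605, bind accepted

# Time for the server to flag the callback path as down.
server_notice_delay = cb_path_down_seen - first_cb_recall
# Time for the client to react with BindConnectionToSession.
client_rebind_delay = back_channel_rebound - cb_path_down_seen
# Total time during which no callback could be delivered.
total_stall = back_channel_rebound - first_cb_recall

print(f"server notices:  {server_notice_delay} s")
print(f"client rebinds:  {client_rebind_delay} s")
print(f"total stall:     {total_stall} s")
```

Almost all of the stall is the client waiting to rebind the back channel, not the server detecting the problem.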
>
> >--> One more hint. Starting with #3605, things are working again.
>       --> Things start working again because the FreeBSD server
>             cheats and allows the BindConnectionToSession to be done.
>             RFC5661 specifies a reply of NFS4ERR_INVAL for this.
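The difference between the strict behaviour described above and the FreeBSD "cheat" can be sketched as one decision function. This is a toy model and not the server's code: bind_conn_reply and its parameters are hypothetical, and the status names are plain strings standing in for the protocol constants.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_INVAL = "NFS4ERR_INVAL"


def bind_conn_reply(fore_implicitly_bound: bool, lenient: bool) -> str:
    """Status a server returns for BindConnectionToSession (toy model).

    fore_implicitly_bound: a Sequence operation was already seen on this
        connection, so the fore channel is implicitly bound.
    lenient: the server accepts the late bind anyway, as the FreeBSD
        server does in the capture.
    """
    if fore_implicitly_bound and not lenient:
        # Strict reading described in the message: too late to bind.
        return NFS4ERR_INVAL
    return NFS4_OK


# Packet# 3604 arrives after the implicit binding made by packet# 3530.
strict_reply = bind_conn_reply(fore_implicitly_bound=True, lenient=False)
freebsd_reply = bind_conn_reply(fore_implicitly_bound=True, lenient=True)
print(strict_reply, freebsd_reply)
```

With a strict server the late bind would be refused and, in this model, callbacks would stay blocked on that connection.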
>
> >There are actually 3 other examples of this in the packet capture.
> Every time multiple concurrent callbacks are attempted, the Linux
> client "bails out" by creating a new TCP connection.
> --> This is said to be fixed in Linux 5.3, but I haven't tested a newer
>        kernel than 5.2 yet.
>
> >Btw, one of the weirdnesses is said to be fixed in Linux 5.3 and the other
> >in Linux 5.7, although I have not yet upgraded my kernel and tested this.
> The "do BindConnectionToSession after an implicit binding" bug is said to be
> fixed in Linux 5.7, however the fix is not exactly what I would have expected.
> --> I would have expected a BindConnectionToSession to be done right
>       away when a new TCP connection is created.
>       --> Linux 5.7 and newer is said to still wait (15sec?) to do the
>             BindConnectionToSession, but fixes the bug by creating yet
>             another new TCP connection just before doing the
>             BindConnectionToSession RPC.
>       --> A SEQ4_STATUS_CB_PATH_DOWN flag set in a Sequence operation
>             reply is what triggers the BindConnectionToSession and that is
>             still required for 5.7 or newer, but I'll need to test to see
>             how long it takes for newer kernels.
>
> The old "cheat", which is still in the released server code (recently removed
> by a patch in main, stable/12 and stable/13), implicitly bound both the fore
> and back channels. Look for this comment in sys/fs/nfsserver/nfs_nfsdstate.c
> in unpatched code...
>         /*
>          * If this session handles the backchannel, save the nd_xprt for this
>          * RPC, since this is the one being used.
>          * RFC-5661 specifies that the fore channel will be implicitly
>          * bound by a Sequence operation.  However, since some NFSv4.1 clients
>          * erroneously assumed that the back channel would be implicitly
>          * bound as well, do the implicit binding unless a
>          * BindConnectiontoSession has already been done on the session.
>          */
>
> --> This worked fine and avoided most of the above craziness, but...
>        (A) It violated RFC5661.
>        and
>        (B) It broke the Linux client badly when the "nconnect" mount
>             option (added fairly recently) was used.
>        --> So I felt I had to get rid of it. (The non-conformance with
>               RFC5661 was reported by redhat.)
>
> Bottom line...unless all your Linux clients are kernel version 5.3 or newer,
> avoid enabling delegations in the FreeBSD NFSv4.1/4.2 server.
> --> Even with a completely patched server, you will still get 15 second pauses
>       every time the server attempts multiple concurrent callbacks.
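The bottom-line rule can be turned into a quick check over a list of client kernel versions. This is a rough sketch, not a FreeBSD or Linux tool: parse_kernel and delegations_advisable are hypothetical helpers, the input is assumed to look like the output of "uname -r", and the 5.3 threshold is the one stated above (reported as fixed there, not verified).

```python
def parse_kernel(version: str) -> tuple:
    """Return (major, minor) from a string such as '5.2.18-200.fc30.x86_64'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def delegations_advisable(client_kernels, minimum=(5, 3)) -> bool:
    """True only if every listed Linux client kernel is at least `minimum`.

    Note: an empty list returns True, since there is no old client to worry about.
    """
    return all(parse_kernel(v) >= minimum for v in client_kernels)


# Hypothetical inventory of Linux clients mounting the server.
mixed = ["5.2.18-200.fc30.x86_64", "5.10.0"]
newer = ["5.3.7", "5.10.0"]
print(delegations_advisable(mixed), delegations_advisable(newer))
```

Comparing (major, minor) tuples avoids the string-comparison trap where "5.10" sorts before "5.3".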
>
> >Have fun with it, rick
> At least you can now see why I have "fun with it";-) rick
>

Ughh.  I'm glad you figured it out so I didn't have to.  Thanks for all the
hard work, Rick.

