Linux NFSv4 clients: bad sequence-id errors.

Ruben mail at osfux.nl
Wed Feb 28 11:45:43 UTC 2018


On 27/02/2018 23:54, Rick Macklem wrote:
> Ruben wrote:
>> I'm experiencing a strange issue on a machine providing a couple of
>> nfsv4 exports. A Linux client that generates a lot of traffic to and
>> from the nfs server sometimes starts throwing "bad sequence-id errors":
>> Feb 27 10:39:42 localhost kernel: [12481477.608103] NFS: v4 server
>> returned a bad sequence-id error on an unconfirmed sequence 80f7d0d0!
> The handling of sequence-id in NFSv4.0 is complex and I won't even try
> to guess why this is happening.
> I am surprised that your Linux mounts are using NFSv4.0 and not NFSv4.1?
> (Usually Linux uses the most recent version supported by the server.)
> I mention this since "sessions" replaced the sequence-id stuff in NFSv4.1
> and, as such, shouldn't have such an issue.
After some digging around I found out that the Linux kernel on the
client (Raspbian running Hexxeh firmware) was compiled with the
following NFSv4 options:

CONFIG_NFS_V4=y
# CONFIG_NFS_V4_1 is not set

So the client kernel has no NFSv4.1 support at all.

Thank you for pointing that out, I'll resolve that.
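
For anyone wanting to run the same check on their own client, here is a minimal Python sketch that pulls the NFSv4-related options out of a kernel config. The function names are made up for illustration, and it assumes the running kernel exposes its config at /proc/config.gz, which is not the case on every build. If it does not, the text of the config file used for the build can be passed to nfs_config_options() instead.

```python
import gzip
import re
from pathlib import Path


def nfs_config_options(config_text):
    """Return {option: value} for every CONFIG_NFS_V4* option in a kernel config.

    Options written as '# CONFIG_... is not set' are reported with value 'n'.
    (Hypothetical helper, not part of any existing tool.)
    """
    options = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        enabled = re.match(r"^(CONFIG_NFS_V4\w*)=(.*)$", line)
        if enabled:
            options[enabled.group(1)] = enabled.group(2)
            continue
        disabled = re.match(r"^# (CONFIG_NFS_V4\w*) is not set$", line)
        if disabled:
            options[disabled.group(1)] = "n"
    return options


def read_running_kernel_config(path="/proc/config.gz"):
    """Return the running kernel's config as text, or None if it is not exposed."""
    config_path = Path(path)
    if not config_path.exists():
        return None
    with gzip.open(config_path, "rt") as handle:
        return handle.read()


if __name__ == "__main__":
    text = read_running_kernel_config()
    if text is None:
        print("kernel config not exposed at /proc/config.gz")
    else:
        for name, value in sorted(nfs_config_options(text).items()):
            print(f"{name}={value}")
```

Fed the two lines quoted above, nfs_config_options() reports CONFIG_NFS_V4 as "y" and CONFIG_NFS_V4_1 as "n".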
>
>> They typically occur after a couple of months of uptime on the nfsd
>> machine. Every couple of seconds they are thrown by the client. The
>> situation is "remedied" by restarting the nfsd on the server. Although
>> functionality on the specific client does not appear to be affected
>> (much?), it's a bit disturbing. I've done some digging and found :
> The fact that this is fixed by restarting the nfsd suggests a client side
> problem.
> Why?
> Because restarting the nfsd does not reset any server state, so the sequence-id
> situation would not be affected by doing this. (To get rid of server side state,
> you must unload the nfsd.ko after killing off the nfsd daemon.)
>
> All restarting the nfsd daemon will do is force the client to establish a new
> TCP connection. That is at a layer below the NFS state.
Thank you for your elaboration!
>
>> https://lists.freebsd.org/pipermail/freebsd-fs/2015-July/021707.html
>>
>> and the patch attached by Rick (nfsv41exch.patch:
>> http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20150729/586f776a/attachment.bin
>> ).
>>
>> Since the issue started manifesting itself I have restarted the nfs
>> daemon (grabbed a pcap and the corresponding error lines mentioning the
>> sequences prior to doing that in case anyone is interested).
> If you email me the pcap as an attachment, I can take a look at it in wireshark.
I've sent a download link to my troubleshooting efforts to you. Please
do not look into it purely on my behalf (I'm focussing on getting the
client to actually run nfs v4.1 instead of v4.0)!
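
To confirm which NFS version a Linux client actually negotiated, the mount options in /proc/mounts can be inspected. The sketch below is a rough illustration with a hypothetical function name. It assumes the version shows up either as a single "vers=4.1" option or as "vers=4" plus "minorversion=1", which are the two forms I am aware of; other kernels may report it differently.

```python
def nfs_mount_versions(mounts_text):
    """Map each NFS mount point to its reported protocol version string.

    mounts_text is the content of /proc/mounts (or text in the same format).
    A mount whose options carry no 'vers=' entry maps to None.
    (Hypothetical helper for illustration only.)
    """
    versions = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        major = None
        minor = None
        for option in fields[3].split(","):
            if option.startswith("vers="):
                major = option[len("vers="):]
            elif option.startswith("minorversion="):
                minor = option[len("minorversion="):]
        # Combine the older 'vers=4,minorversion=1' form into '4.1'.
        if major is not None and "." not in major and minor is not None:
            major = f"{major}.{minor}"
        versions[fields[1]] = major
    return versions


if __name__ == "__main__":
    with open("/proc/mounts") as handle:
        for mount_point, version in nfs_mount_versions(handle.read()).items():
            print(f"{mount_point}: NFS version {version}")
```

A mount that still reports 4.0 after the kernel has been rebuilt with CONFIG_NFS_V4_1 would suggest that the version is being pinned elsewhere, for example in the mount options.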
>
>> The nfs server runs FreeBSD 11.1 :
> I'm being lazy and not looking, but I am almost sure a 2015 patch will be in 11.1
> and probably also in 10.2 and 11.0.

>> I'm wondering: can the 2015 patch provided by Rick still be "safely"
>> applied or has the nfs code changed too much since then? I've witnessed
>> this issue a couple of times now and would very much like to test the
>> patch provided.
> As above, I'd be surprised if the patch isn't already in your 11.1 kernel,
> but you can take a look.
> If it isn't, let me know because that means it slipped through the cracks
> and I need to get it committed, etc.
I'm not at all versed in C, but if I find anything I'll get back to you.
> rick


Ruben

