Linux NFSv4 clients are getting (bad sequence-id error!)

Ahmed Kamal email.ahmedkamal at googlemail.com
Wed Jul 1 22:04:57 UTC 2015


Hi all,

*warning*: Sorry, I'm cross-posting this from freebsd-fs; things are too
quiet there, unfortunately.

I'm a refugee from linux land. I just set up my first freebsd 10.1 zfs box,
sharing /home over nfs. Since every home directory is its own zfs dataset,
I chose to use nfsv4 to enable recursively sharing/mounting any directory
under /home (I understand nfs4 is a must in this scenario!)

I'm able to mount from linux (rhel5 latest kernel) successfully. Users are
working fine. However, every now and then a user screams that his session is
frozen. Usually the processes are stuck in nfs_wait or rpc_* state. I tried
using a much newer linux kernel (3.2), however it still faced the same
problem. The errors in the Linux log files are mostly:
Jul  1 17:41:47 mammoth kernel: NFS: v4 server nas  returned a bad sequence-id error!
Jul  1 17:52:32 mammoth kernel: nfs4_reclaim_locks: unhandled error -11. Zeroing state
Jul  1 17:52:32 mammoth kernel: nfs4_reclaim_open_state: Lock reclaim failed!
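
To get a feel for how often each of these messages shows up on a client, something like the following rough sketch could be run over the kernel log. The function name count_nfs4_errors and the category labels are made up for illustration; the patterns are simply substrings of the messages quoted above, and the lines are assumed to be unwrapped, as the kernel writes them.

```python
import re
from collections import Counter

# Substrings taken from the kernel messages quoted above.
# The category names on the left are illustrative labels only.
PATTERNS = {
    "bad_seqid": re.compile(r"returned a bad sequence-id error"),
    "reclaim_locks": re.compile(r"nfs4_reclaim_locks: unhandled error"),
    "reclaim_open_state": re.compile(r"nfs4_reclaim_open_state: Lock reclaim failed"),
}


def count_nfs4_errors(lines):
    """Count how many log lines match each NFSv4 error pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (the log path is an assumption): python count_nfs4.py /var/log/messages
    with open(sys.argv[1], errors="replace") as log:
        for name, n in count_nfs4_errors(log).items():
            print(f"{name}: {n}")
```

Comparing these counts before and after a server-side change should show whether the change made any difference.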

My search led me to a detailed analysis of the issue
(https://access.redhat.com/solutions/1328073), which you can read here:
https://dl.dropboxusercontent.com/u/51939288/nfs4-bad-seq.pdf
NetApp confirmed this was a bug for them (I'm wondering whether it is still
present in FreeBSD?!)

PS: Right before sending this, I saw dmesg on the freebsd box advising me to
increase vfs.nfsd.tcphighwater, so I raised that to 64000. I also raised
the number of nfs server threads (-t) from 10 to 60 (we have roughly 40 linux
machines).

Any advice is most appreciated!

Thanks

