Linux NFSv4 clients are getting (bad sequence-id error!)

Sun Jul 5 21:01:29 UTC 2015

Hi folks,

Just a quick update. I have not tested Xin's patches yet. What I have done so
far is increase the tcp highwater tunable and raise the nfsd thread count to 60.
Today (a working day) I got only one bad sequence error message!
Check this:

# grep 'bad sequence' messages* | awk '{print $1 $2}' | uniq -c
      1 messages:Jul5
     39 messages.1:Jun28
     15 messages.1:Jun29
      4 messages.1:Jun30
      9 messages.1:Jul1
     23 messages.1:Jul2
      1 messages.1:Jul4
      1 messages.2:Jun28

So there seems to be an improvement! Not sure if the Linux nfs4 client is
able to somehow recover from those bad-sequence situations or not .. I did
get some user complaints that running "ls -l" is sometimes slow and takes a
couple of seconds to finish.

One final question .. Do you folks think nfs4.1 is more reliable in general
than nfs4 .. I've always only used nfs3 (I guess it can't work here with
/home/* being separate zfs filesystems) .. So should I go through the pain
of upgrading a few servers to RHEL-6 to try out nfs4.1 ? Basically do you
expect the protocol to be more solid ? I know it's a fluffy question, just
give me your thoughts. Thanks a lot!

On Fri, Jul 3, 2015 at 2:51 AM, Rick Macklem <rmacklem at uoguelph.ca> wrote:

> Ahmed Kamal wrote:
> > PS: Today (after adjusting tcp.highwater) I didn't get any screaming
> > reports from users about hung vnc sessions. So maybe just maybe, linux
> > clients are able to somehow recover from these bad sequence messages. I
> > could still see the bad sequence error message in logs though
> >
> > Why isn't the highwater tunable set to something better by default ? I mean
> > this server is certainly not under a high or unusual load (it's only 40 PCs
> > mounting from it)
> >
> > On Fri, Jul 3, 2015 at 1:15 AM, Ahmed Kamal
> > <email.ahmedkamal at googlemail.com> wrote:
> >
> > > Thanks all .. I understand now we're doing the "right thing" .. Although
> > > if mounting keeps wedging, I will have to solve it somehow! Either using
> > > Xin's patch .. or Upgrading RHEL to 6.x and using NFS4.1.
> > >
> > > Regarding Xin's patch, is it possible to build the patched nfsd code as a
> > > kernel module ? I'm looking to minimize my delta to upstream.
> > >
> Yes, you can build the nfsd as a module. If your kernel config does not include
> "options NFSD" the module will get loaded/used. It is also possible to replace
> the module without rebooting, but you need to kill off the nfsd daemon, then
> kldunload nfsd.ko and replace nfsd.ko with the new one. (In
> /boot/<kernel-name>.)
>
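
The module swap described above can be sketched as a small shell function.
This is an untested sketch, not a verified procedure: the function name,
its two arguments and the DRY_RUN switch are made up for illustration, and
using "service nfsd stop/start" to stop and restart the daemon is an
assumption. It only applies when the kernel config has no "options NFSD".

```shell
#!/bin/sh
# Sketch: replace nfsd.ko on a running FreeBSD server without rebooting.
# Hypothetical names: swap_nfsd_module, run_cmd, DRY_RUN.
# With DRY_RUN=1 (the default here) the commands are only printed.

run_cmd() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

swap_nfsd_module() {
    new_ko=$1        # path to the rebuilt (patched) nfsd.ko
    kernel_dir=$2    # the /boot/<kernel-name> directory of the running kernel

    run_cmd service nfsd stop &&                 # stop the nfsd daemon first
    run_cmd kldunload nfsd.ko &&                 # unload the old module
    run_cmd cp "$new_ko" "$kernel_dir/nfsd.ko" && # put the new module in place
    run_cmd kldload nfsd &&                      # load the new module
    run_cmd service nfsd start                   # bring the daemon back up
}
```

Call it with the path of the rebuilt nfsd.ko and the /boot/<kernel-name>
directory as arguments, check the printed commands, and set DRY_RUN=0 to
actually run them. Clients will see the server stall while nfsd is down.
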
> > > Also would adopting Xin's patch and hiding it behind a
> > > kern.nfs.allow_linux_broken_client be an option (I'm probably not the last
> > > person on earth to hit this) ?
> > >
> If it fixes your problem, I think this is reasonable.
> I'm also hoping that someone that works on the Linux client reports if/when
> this was changed.
>
> rick
>
> > > Thanks a lot for all the help!
> > >
> > > On Thu, Jul 2, 2015 at 11:53 PM, Rick Macklem <rmacklem at uoguelph.ca>
> > > wrote:
> > >
> > >> Ahmed Kamal wrote:
> > >> > Appreciating the fruitful discussion! Can someone please explain to me,
> > >> > what would happen in the current situation (linux client doing this
> > >> > skip-by-1 thing, and freebsd not doing it) ? What is the effect of that?
> > >> Well, as you've seen, the Linux client doesn't function correctly against
> > >> the FreeBSD server (and probably others that don't support this
> > >> "skip-by-1" case).
> > >>
> > >> > What do users see? Any chances of data loss?
> > >> Hmm. Mostly it will cause Opens to fail, but I can't guess what the Linux
> > >> client behaviour is after receiving NFS4ERR_BAD_SEQID. You're the guy
> > >> observing it.
> > >>
> > >> >
> > >> > Also, I find it strange that netapp have acknowledged this is a bug on
> > >> > their side, which has been fixed since then!
> > >> Yea, I think Netapp screwed up. For some reason their server allowed this,
> > >> then was fixed to not allow it and then someone decided that was broken
> > >> and reversed it.
> > >>
> > >> > I also find it strange that I'm the first to hit this :) Is no one
> > >> > running nfs4 yet!
> > >> >
> > >> Well, it seems to be slowly catching on. I suspect that the Linux client
> > >> mounting a Netapp is the most common use of it. Since it appears that they
> > >> flip flopped w.r.t. whose bug this is, it has probably persisted.
> > >>
> > >> It may turn out that the Linux client has been fixed or it may turn out
> > >> that most servers allowed this "skip-by-1" even though David Noveck (one
> > >> of the main authors of the protocol) seems to agree with me that it should
> > >> not be allowed.
> > >>
> > >> It is possible that others have bumped into this, but it wasn't isolated
> > >> (I wouldn't have guessed it, so it was good you pointed to the RedHat
> > >> discussion) and they worked around it by reverting to NFSv3 or similar.
> > >> The protocol is rather complex in this area and changed completely for
> > >> NFSv4.1, so many have also probably moved onto NFSv4.1 where this won't be
> > >> an issue.
> > >> (NFSv4.1 uses sessions to provide exactly once RPC semantics and doesn't
> > >> use these seqid fields.)
> > >>
> > >> This is all just mho, rick
> > >>
> > >> > On Thu, Jul 2, 2015 at 1:59 PM, Rick Macklem <rmacklem at uoguelph.ca>
> > >> > wrote:
> > >> >
> > >> > > Julian Elischer wrote:
> > >> > > > On 7/2/15 9:09 AM, Rick Macklem wrote:
> > >> > > > > I am going to post to nfsv4 at ietf.org to see what they say. Please
> > >> > > > > let me know if Xin Li's patch resolves your problem, even though I
> > >> > > > > don't believe it is correct except for the UINT32_MAX case. Good
> > >> > > > > luck with it, rick
> > >> > > > and please keep us all in the loop as to what they say!
> > >> > > >
> > >> > > > the general N+2 bit sounds like bullshit to me.. it's always N+1 in a
> > >> > > > number field that has a bit of slack at wrap time (probably due to
> > >> > > > some ambiguity in the original spec).
> > >> > > >
> > >> > > Actually, since N is the lock op already done, N + 1 is the next lock
> > >> > > operation in order. Since lock ops need to be strictly ordered, allowing
> > >> > > N + 2 (which means N + 2 would be done before N + 1) makes no sense.
> > >> > >
> > >> > > I think the author of the RFC meant that N + 2 or greater fails, but it
> > >> > > was poorly worded.
> > >> > >
> > >> > > I will pass along whatever I get from nfsv4 at ietf.org. (There is an
> > >> > > archive of it somewhere, but I can't remember where.;-)
> > >> > >
> > >> > > rick
> > >> > > _______________________________________________
> > >> > > freebsd-fs at freebsd.org mailing list
> > >> > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > >> > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"
> > >> > >
> > >> >
> > >>
> > >
> > >
> >
>
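
For reference, the seqid ordering rule argued over in the quoted thread (N is
the last op done, N + 1 is the next one in order, N + 2 makes no sense) can
be sketched in a few lines of shell. This is only an illustration, not the
FreeBSD nfsd code. The function name is made up, the "replay" case for a
repeated N comes from my reading of the NFSv4.0 RFC rather than from this
thread, and treating the successor of 4294967295 (UINT32_MAX) as 0 is an
assumption: how the wrap is handled is exactly the part in dispute.

```shell
#!/bin/sh
# Sketch of a strict NFSv4.0 seqid check (hypothetical helper, not nfsd code).
# Usage: classify_seqid LAST RECEIVED
# Prints "next", "replay" or "bad".
# Assumes the shell does arithmetic in at least 64 bits.

classify_seqid() {
    last=$1        # N: seqid of the last op the server has done
    received=$2    # seqid carried by the incoming op

    # Assumption: seqids are 32-bit and UINT32_MAX + 1 wraps to 0.
    expected=$(( (last + 1) % 4294967296 ))

    if [ "$received" -eq "$expected" ]; then
        echo next      # N + 1: the next op in order, accept it
    elif [ "$received" -eq "$last" ]; then
        echo replay    # N again: retransmission of the last op
    else
        echo bad       # N + 2 or anything else: NFS4ERR_BAD_SEQID
    fi
}

classify_seqid 41 42    # prints "next"
classify_seqid 41 43    # prints "bad" (the "skip-by-1" case)
```

A server that tolerates the Linux "skip-by-1" behaviour would also accept
N + 2 here, which is what a strict check like this one refuses.
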