Definite NFS bug

Garrett Wollman wollman at csail.mit.edu
Thu Oct 30 22:44:40 UTC 2014


Like many other users, I upgrade my FreeBSD servers by NFS-mounting
/usr/src and /usr/obj from a shared build server.[1]  Since I upgraded
the build server to 9.3, clients running 9.3 kernels have been
randomly erroring out during installkernel and installworld.  Today I
had some time to look more closely into this and found that the error
is definitely coming from the server: at some point, it just randomly
starts returning errors to client ACCESS and GETATTR operations.  The
errors are a mix of NFS3ERR_IO and NFS3ERR_ACCES, but there is nothing
on the server to indicate any kind of error, and restarting the
operation on the client causes it to fail in a different place.  With
enough patience and restarts, it's possible to complete the
installation in just four or five passes.

Needless to say this is a bit worrying.  Strangely, 9.1 and 9.2
clients don't see this issue at all; it's only 9.3 clients that
break.

It's easy to reproduce: just 'cd /usr/src && find . -type f >/dev/null'.
It does not seem to depend on the client NFS version (3 or 4) or
implementation ("old" or "new").  I haven't tried the "old" server yet
-- I'll need to figure out how to do that first.
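
For anyone who wants a bit more detail than find gives, here is a rough
sketch of the same tree walk in Python that records which path failed
and with which errno.  It is only an illustration, not something from
the original test: the function names scan_tree and summarize are
made up, and I am assuming that NFS3ERR_IO and NFS3ERR_ACCES show up on
the client as EIO and EACCES.

```python
import errno
import os
from collections import Counter


def _errno_name(exc):
    # Translate a numeric errno into its symbolic name, e.g. 5 -> "EIO".
    return errno.errorcode.get(exc.errno, str(exc.errno))


def scan_tree(root):
    """Walk root and lstat() every non-directory entry, like 'find . -type f'.

    Returns (checked, failures).  failures is a list of
    (path, operation, errno_name) tuples in the order they occurred.
    """
    checked = 0
    failures = []

    def note_walk_error(exc):
        # os.walk reports directory-listing failures through this callback
        # instead of raising, so the walk continues past a bad directory.
        failures.append((exc.filename, "readdir", _errno_name(exc)))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=note_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            checked += 1
            try:
                os.lstat(path)  # attribute lookup on the entry itself
            except OSError as exc:
                failures.append((path, "lstat", _errno_name(exc)))
    return checked, failures


def summarize(failures):
    """Count failures per errno name, e.g. {'EIO': 3, 'EACCES': 1}."""
    return Counter(name for _path, _op, name in failures)
```

Calling scan_tree("/usr/src") on an affected client and printing the
first entry of the failures list should show where the errors begin on
that pass; summarize() gives the mix of error codes.  Since the failure
point moves between runs, comparing the first failing path across
several passes may be more useful than any single run.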

If anyone is willing to help debug this, I can share a packet trace,
but I don't think it's very informative.  Also, if anyone has a good
dtrace script that I could run on the server that would report what's
going on when that first NFS3ERR_IO is returned, that would be great.

-GAWollman

[1] I'd run my own freebsd-update server but unfortunately it is too
tied to building things that look like official FreeBSD security
updates, and isn't really designed for (e.g.) updating kernels when we
change a configuration option.  It also doesn't have any obvious knobs
for building with anything other than a default {make,src}.conf.
And with a pkg-able base just around the corner I don't really want to
put much effort into making freebsd-update do what I want.  NFS, on
the other hand, is a big deal and so I need to track down and fix
these bugs.

