Definite NFS bug

Russell L. Carter rcarter at pinyon.org
Fri Oct 31 00:07:41 UTC 2014


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



On 10/30/14 15:44, Garrett Wollman wrote:
> Like many other users, I upgrade my FreeBSD servers by
> NFS-mounting /usr/src and /usr/obj from a shared build server.[1]
> Since I upgraded the build server to 9.3, clients running 9.3
> kernels have been randomly erroring out during installkernel and
> installworld.  Today I had some time to look more closely into this
> and found that the error is definitely coming from the server: at
> some point, it just randomly starts returning errors to client
> ACCESS and GETATTR operations.  The errors are a mix of NFS3ERR_IO
> and NFS3ERR_ACCES, but there is nothing on the server to indicate
> any kind of error, and restarting the operation on the client
> causes it to fail in a different place.  With enough patience and
> restarts, it's possible to complete the installation in just four
> or five passes.
> 
> Needless to say this is a bit worrying.  Strangely, 9.1 and 9.2 
> clients don't see this issue at all; it's only 9.3 clients that 
> break.
> 
> It's easy to reproduce: just run
> 'cd /usr/src && find . -type f >/dev/null'.
> It does not seem to depend on the client NFS version
> (3 or 4) or implementation ("old" or "new").  I haven't tried the
> "old" server yet -- I'll need to figure out how to do that first.
> 
> If anyone is willing to help debug this, I can share a packet
> trace, but I don't think it's very informative.  Also, if anyone
> has a good dtrace script that I could run on the server that would
> report what's going on when that first NFS3ERR_IO is returned, that
> would be great.

This sounds somewhat like what I have been complaining about.
I of course have no competency here, but if I build the world
with -j1, I have a much better chance of successful remote installs.
The problems I've been seeing on -current for the last few months
look to me like out-of-date targets: the install fails because the
remote client tries to rebuild the out-of-date target on the
read-only file system.  My new plan is to dump all of the
st_atim and st_mtim for every .depend list on both systems
when I see the problem again, to see if something jumps out.
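
A minimal sketch of such a dump, assuming Python 3 is available on both
machines and that "every .depend list" means every file whose name starts
with ".depend" under the object tree. The function name dump_depend_times
and the default root /usr/obj are illustrative choices, not anything from
the build system. It reads st_atime_ns and st_mtime_ns, which carry the
same values as st_atim and st_mtim, and prints paths relative to the root
so the output from the two systems can be compared with diff.

```python
#!/usr/bin/env python3
"""Dump atime/mtime for .depend files under a tree, one line per file."""
import os
import sys


def dump_depend_times(root):
    """Return sorted (relative path, atime ns, mtime ns) tuples for every
    file under root whose name starts with '.depend' (an assumption about
    which files matter)."""
    rows = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.startswith(".depend"):
                continue
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)
            except OSError as exc:
                # Report the failure on stderr and keep going.
                print(f"lstat failed: {full}: {exc}", file=sys.stderr)
                continue
            rel = os.path.relpath(full, root)
            rows.append((rel, st.st_atime_ns, st.st_mtime_ns))
    rows.sort()
    return rows


if __name__ == "__main__":
    # Hypothetical default; pass the object tree as the first argument.
    tree = sys.argv[1] if len(sys.argv) > 1 else "/usr/obj"
    for rel, atime_ns, mtime_ns in dump_depend_times(tree):
        print(f"{rel}\t{atime_ns}\t{mtime_ns}")
```

Running it once on the build server and once on the client, each time
redirecting to a file, gives two listings that can be diffed directly.
Access times will legitimately differ if either side reads the files, so
the mtime column is probably the one to look at first.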

I just reinstalled everybody with -j1 builds of r273808M, with no
problems.  Last week, however, a fast box failed.  It is rather
concerning for an install to fail, say, 2/3 of the way through.  I have
to admit that when, soon after, that 2/3-installed system crashed (on
NFS unmount), I had to step out of the room for the reboot.  Exciting.

I am traveling on Sunday for a week, but I've got a few days to
run things on several big fast 8cpu boxes (my old laptop is
much less afflicted with this problem, though it occasionally
fails too).
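
For runs like these, a small walker that records where the errors first
show up may be more useful than the bare find command quoted above. The
following is only a sketch under stated assumptions: Python 3 on the
client, the function name walk_and_stat is made up for illustration, and
the default path /usr/src is taken from the quoted message. On the client
side NFS3ERR_IO normally surfaces as EIO and NFS3ERR_ACCES as EACCES, so
the script records the errno name for each failing path.

```python
#!/usr/bin/env python3
"""Walk a tree like `find . -type f` and record every stat/readdir error."""
import errno
import os
import stat
import sys


def errno_name(exc):
    """Map an OSError to its symbolic errno name, e.g. 'EIO'."""
    return errno.errorcode.get(exc.errno, str(exc.errno))


def walk_and_stat(root):
    """Return (regular_file_count, errors) for the tree at root.

    errors lists (path, errno name) for each failed directory listing or
    lstat() call, in the order encountered."""
    errors = []
    regular_files = 0

    def on_walk_error(exc):
        # Called by os.walk when a directory cannot be listed.
        errors.append((exc.filename, errno_name(exc)))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)
            except OSError as exc:
                errors.append((full, errno_name(exc)))
                continue
            if stat.S_ISREG(st.st_mode):
                regular_files += 1
    return regular_files, errors


if __name__ == "__main__":
    tree = sys.argv[1] if len(sys.argv) > 1 else "/usr/src"
    count, errs = walk_and_stat(tree)
    for path, name in errs:
        print(f"{name}\t{path}")
    print(f"{count} regular files, {len(errs)} errors")
```

Because the walk continues past failures, a single pass shows whether the
errors cluster in one directory or are scattered across the tree.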

Russell


> -GAWollman
> 
> [1] I'd run my own freebsd-update server but unfortunately it is
> too tied to building things that look like official FreeBSD
> security updates, and isn't really designed for (e.g.) updating
> kernels when we change a configuration option.  It also doesn't
> have any obvious knobs for building with anything other than a
> default {make,src}.conf. And with a pkg-able base just around the
> corner I don't really want to put much effort into making
> freebsd-update do what I want.  NFS, on the other hand, is a big
> deal and so I need to track down and fix these bugs. 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBAgAGBQJUUtLDAAoJEFnLrGVSDFaEu0MQAJOlPWcsduuiS75LUe42uj+E
SRnxSvm5JgUdJojatx7cL5TQjEvXbYov8CE8OLZUqGxIi0D0IdpKlr6WJes8KOUC
wfix7doQZQe3IPqgYAJZz0y6j89q6+QABPTS2oy+cPpYmop9568TvuJJZCCixBOF
Zv3XYa4I7uIl1pYF2zl2nJHtOwLi2wjT+851heqXo8GvIo8SAhBouTN5biPh2JGl
Yabbb4e5xePvigMLEwxbPNslv3nhT1JOcsH9GoFLo5zph2+Txw6ZPy1Sccyv88AQ
w5ID129VMzZChX6zYT7+LtJYLmZME3bVrA2R6YeEdnr/Is8qm5eKtpkMrUz+5Qn4
ULf3fJSCjYdlfatfBIFfi2jFJWBkBY7qVu9S5nqfG9yn4DCLY2UYl4skP71Eo4hz
DPDKQwpuij/Tf8y459Vj60AsOt87Sh0eYBnW+nWJdgIPWptYLNmjv/VHvC8ZFbnn
HsrvUw9DovnTfd7rn+GR4F4+nlnjXqOKdPJtLroId3tSxZzy9L08n7Y6AvAWFFWM
oQ4q/B4LxpOmjXqIBTCrC5ux7GdtKGN2gkAYvY4zh3ngPJJ9ts0BRHbq2zRMo9OA
eUT8Cf+D/wQcFcd+27eI1RJu8IbyycStwGMXbA57UkvJkfSA5CVpcey+T5z9uyPa
7xlgxCpHOIHSJ6l2BeSQ
=4Q5V
-----END PGP SIGNATURE-----

