NFS Locking Issue

Robert Watson rwatson at FreeBSD.org
Tue Jul 4 20:47:23 UTC 2006


On Tue, 4 Jul 2006, Scott Long wrote:

> For what it's worth, I recently spent a lot of time putting FreeBSD 6.1 to 
> the test as both an NFS client and server in a mixed OS environment. By far 
> and away, the biggest problems that I encountered with it were due to linux 
> NFS bugs.  CentOS, FC, and SuSE all created huge problems under load, and it 
> was impossible to get stable results until I started using 2.6.12 and higher 
> kernels.
>
> I have a variety of theories that I wish I had had time to test.  I've seen 
> hints of problems with READDIRPLUS, with FreeBSD's habit of mapping GETATTR 
> to ACCESS, and with handle sizes.  But in any case, it's been no secret that 
> Linux has had very severe NFS problems in the past, and that the NetApp 
> folks have worked very hard over the last year to fix them in the most 
> recent Linux kernel releases.  The only real fault I give FreeBSD is 
> rpc.lockd.  It's pretty much useless in all but trivial circumstances. 
> Beyond that, make sure you're using a linux kernel that is relatively 
> recent.

BTW, I noticed yesterday that the IPv6 support commit to rpc.lockd was never 
backed out.  An immediate question for people experiencing new rpc.lockd 
problems with 6.x should be whether or not backing out that change helps.

I set up a simple local testbed for rpc.lockd this morning and have started 
running some basic tests.  I wasn't able to trivially reproduce rpc.lockd 
problems reported for cp -r, although I did run into another bug in the 
memory mapping of zero-length files following creation in the NFS client, 
which I've passed on to Mohan.
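
For anyone who wants to poke at this themselves, a basic client-side check 
can be sketched as below.  This is only an illustration, not the test I ran: 
it assumes the file lives on an NFS mount where advisory locks are forwarded 
to rpc.lockd, and the function name try_lock and the example path are made up 
for the sketch.

```python
import fcntl
import os


def try_lock(path):
    """Request an exclusive, non-blocking advisory lock on `path`.

    Returns True if the lock was granted (it is released again before
    returning), or False if the request was refused, for example because
    another process already holds a conflicting lock.
    """
    # Create the file if needed; the lock is taken on the open descriptor.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # POSIX byte-range lock over the whole file via fcntl(2).
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical path; point it at a file on the NFS mount under test.
    print(try_lock("/mnt/nfs/locktest"))
```

Running it from two clients against the same file at once gives a crude 
contention test, but it still exercises our client code and the server 
together rather than either one in isolation.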

I think what's needed is a wire-level regression suite, though, in order to 
avoid mixing up our rpc.lockd client code with the tests for rpc.lockd's 
server.  This is something I may be able to start looking at this week, 
although it's the usual time trade-off: work on getting audit ready for MFC, 
network stack locking and protocol cleanup/bug fixing, or throw rpc.lockd into 
the mix as well?  If we can demonstrate that backing out the IPv6 change 
clearly helps, we need to figure out why it's causing the problem.  A casual 
read of the change doesn't reveal anything obvious, unfortunately, which 
suggests something non-obvious :-(.

Robert N M Watson
Computer Laboratory
University of Cambridge


More information about the freebsd-stable mailing list