Stale NFS file handles on 8.x amd64
Rick Macklem
rmacklem at uoguelph.ca
Wed Dec 1 23:33:07 UTC 2010
>
> I'll give negnametimeo=0 a try on one server starting tonight, I'll be
> busy tomorrow and don't want to risk making anything potentially worse
> than it is yet. I can't figure out how to disable the attr cache in
> FreeBSD. Neither suggestion seems to be valid, and years ago when I
> looked into it I got the impression that you can't, but I'd love to be
> proven wrong.
I just looked and, yes, you are correct, in that the cached attributes
are still used while NMODIFIED is set if the mtime isn't within the
current second. (I'm not going to venture a guess as to why this is done
at this time:-)
But "acregmin=0,acregmax=0,acdirmin=0,acdirmax=0" looks like it comes
close, from a quick inspection of the code. I haven't tested this. You
do have to set both min and max == 0, or max just gets set to min instead
of 0. The *dir* ones apply to directories and the *reg* ones to everything
else.
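
The min/max interaction described above can be sketched roughly as
follows. This is only a toy model of the rule, not the kernel code: the
function name effective_timeouts is made up, and the default values are
assumptions chosen for illustration, so check your own system's defaults.

```python
# Toy model of the min/max rule described above; NOT the FreeBSD kernel code.
# The default timeouts below are assumed values, for illustration only.
ASSUMED_DEFAULTS = {"acregmin": 3, "acregmax": 60, "acdirmin": 30, "acdirmax": 60}


def effective_timeouts(options):
    """Return the attribute cache timeouts after applying mount options."""
    t = dict(ASSUMED_DEFAULTS)
    t.update(options)
    # If max ends up below min, max is raised to min (it does not stay at 0).
    for kind in ("reg", "dir"):
        lo, hi = "ac%smin" % kind, "ac%smax" % kind
        if t[hi] < t[lo]:
            t[hi] = t[lo]
    return t


# Setting only the max values to 0 does not get the timeouts down to 0:
only_max = effective_timeouts({"acregmax": 0, "acdirmax": 0})
# Setting both min and max to 0 does:
both = effective_timeouts(
    {"acregmin": 0, "acregmax": 0, "acdirmin": 0, "acdirmax": 0}
)
print(only_max)
print(both)
```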
> I'll try dotlock when I can. Would disabling statd and
> lockd be the same as using nolock on all mounts?
Nope. If you kill off lockd and statd without using the "nolock"
option, I think all file lock operations will fail with ENOTSUPPORTED
whereas when you mount with "nolock", the lock ops will be done locally
in the client (ie seen by other processes in the same client, but not
by other clients).
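
One way to see which of these cases applies on a given mount is to try a
lock and look at the result. A minimal sketch, assuming Python is
available on the client; probe_lock is a made-up helper name, and the
path you pass in should be a file on the NFS mount in question:

```python
import errno
import fcntl
import os


def probe_lock(path):
    """Try a non-blocking exclusive fcntl lock on path; return a short result."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as e:
            # Report the symbolic errno name so the failure mode is visible.
            return "failed: " + errno.errorcode.get(e.errno, str(e.errno))
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return "ok"
    finally:
        os.close(fd)
```

Note that an "ok" result does not say whether the lock was handled
locally or sent to the server; it only shows that the lock call itself
did not fail.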
> The vacation binary is the only thing I can think of that might use it,
> not sure how well it would like missing it which is how I discovered I
> needed it in the first place. Also, if disabling lockd shows an
> improvement, could it lead to further investigation or is it just a
> workaround?
Well, it's a workaround in the sense that you are avoiding the NLM and
NSM protocols. These are fundamentally flawed protocol designs imho, but
some folks find that they work ok for them. Imho, the two big flaws are:
1 - Allowing a blocking lock in the server. Then what happens if the client
    is network partitioned when the server finally acquires the lock for
    the client? (NFSv4 only allows the server to block for a very short
    time before it replies. The client must "poll" until the lock is
    available, if the client app. allows blocking. In other words, the
    client does the blocking.)
2 - It depends upon the NSM to decide if a node is up/down. I'm not
    sure what the NSM actually does, but it's along the lines of an IP
    broadcast to see if the other host(s) are responding and then sets
    up/down based on how recently it saw a message from a given host.
    (NFSv4 requires that the server recognize a lock request where the
    client had state that predates this boot and reply with an error
    that tells the client to recover its lock state using special
    variants of the lock ops. Imho, this does a much better job of
    making sure the server and clients maintain a consistent set of
    lock state. The server may throw away lock state for an NFSv4 client
    if it hasn't renewed the state within a lease time and then the client
    will be given an "expired" error to tell it that it has lost locks.
    This should only happen when a network partitioning exceeds the lease
    duration and, in the case of the FreeBSD NFSv4 server, it has also
    received a conflicting lock request from another client whose lease
    has not expired.)
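
The "client does the blocking" idea from point 1 can be illustrated at
the application level. This is only an analogy using ordinary fcntl
locks, not the NFSv4 client code; lock_by_polling and its timeout and
interval parameters are made up for the sketch:

```python
import errno
import fcntl
import time


def lock_by_polling(fd, timeout=10.0, interval=0.1):
    """Acquire an exclusive lock on fd by retrying non-blocking attempts.

    Returns True once the lock is held, False if timeout seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Never block inside the lock call; just ask and return at once.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except OSError as e:
            # EACCES/EAGAIN mean someone else holds the lock; anything
            # else is a real error and is passed on to the caller.
            if e.errno not in (errno.EACCES, errno.EAGAIN):
                raise
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Because the waiting happens in the caller's loop, a caller that gets cut
off simply stops asking, and nothing is ever granted on its behalf while
it is unreachable.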
Probably a lot more glop than you expected, but I couldn't resist a chance
to put in a plug for NFSv4 file locking. Btw, you could try NFSv4 mounts,
since Netapp and the experimental FreeBSD8 client both support them.
Good lock (oh, I meant luck:-) with it, rick