NFS locking revisited
Doug Rabson
dfr at rabson.org
Wed Mar 5 01:04:22 PST 2008
Over the last couple of months, I have been working on a complete
re-implementation of the NFS Lock Manager. The new NLM is designed to run
in the kernel environment and uses the kernel's fcntl lock
infrastructure to store its state.
As part of this work, I have augmented the fcntl flock structure to
include an indication of which remote system owns the lock and I have
added some infrastructure to support asynchronous locking (not
currently exposed to userland but required for the NLM). I have also
ported much of the userland sunrpc code to run in the kernel
environment to make life easier (in my opinion, this is how all our
NFS code should have been done from the start).
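For readers less familiar with the fcntl lock interface that the new NLM builds on, here is a minimal userland sketch of taking and querying a byte-range lock. It uses only the standard POSIX fields of struct flock; the extra owner-system field described above is not named in this message, so it is left untouched (zeroed) here. The helper names set_region_lock and query_region_lock are made up for this illustration and do not come from the patch.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Apply or release an advisory byte-range lock through fcntl().
 * type is F_RDLCK, F_WRLCK or F_UNLCK.
 * Returns 0 on success, -1 on failure (errno is set by fcntl).
 */
int
set_region_lock(int fd, short type, off_t start, off_t len)
{
    struct flock fl;

    /*
     * Zero the whole structure first. Any platform-specific fields,
     * such as the remote-owner indication added by this work, then
     * keep their "local" default (an assumption for this sketch).
     */
    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;     /* 0 means "through end of file" */

    /* F_SETLK fails immediately instead of sleeping on a conflict. */
    return (fcntl(fd, F_SETLK, &fl));
}

/*
 * Ask which lock, if any, would block a write lock on the region.
 * On success *out describes the blocking lock, or has
 * l_type == F_UNLCK when nothing held by another process conflicts.
 */
int
query_region_lock(int fd, off_t start, off_t len, struct flock *out)
{
    memset(out, 0, sizeof(*out));
    out->l_type = F_WRLCK;
    out->l_whence = SEEK_SET;
    out->l_start = start;
    out->l_len = len;

    return (fcntl(fd, F_GETLK, out));
}
```

Note that F_GETLK never reports locks held by the calling process itself, so a conflict only shows up when the query comes from a different process (or, with the new NLM, a different remote system).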
Anyone interested in this code can find a snapshot patch, relative to an
approximately two-month-old snapshot of -current, at
http://people.freebsd.org/~dfr/src-lockd-M5-04032008.diff
The current plan is to start committing this work to -current in two or
three weeks' time, depending on feedback.
Highlights include:
* Thread-safe kernel RPC client - many threads can use the same RPC
client handle safely with replies being de-multiplexed at the socket
upcall (typically driven directly by the NIC interrupt) and handed off
to whichever thread matches the reply. For UDP sockets, many RPC
clients can share the same socket. This allows the use of a single
privileged UDP port number to talk to an arbitrary number of remote
hosts.
* Single-threaded kernel RPC server. Adding support for a multi-threaded
server would be relatively straightforward and would follow
approximately the Solaris KPI. A single thread should be sufficient
for the NLM since it should rarely block in normal operation.
* Kernel mode NLM server supporting cancel requests and granted
callbacks. I've tested the NLM server reasonably extensively - it
passes both my own tests and the NFS Connectathon locking tests
running on Solaris, Mac OS X and Ubuntu Linux. The only limitation
compared to the userland NLM server is that it doesn't yet support
the command-line arguments that specify which addresses and port
numbers to listen on. This can and will be fixed soon.
* Userland NLM client supported. While the NLM server doesn't have
support for the local NFS client's locking needs, it does have to
field async replies and granted callbacks from remote NLMs that the
local client has contacted. We relay these replies to the userland
rpc.lockd over a local domain RPC socket.
* IPv6 should be supported but has not been tested since I've been
unable to get IPv6 to work properly with the Parallels virtual
machines that I've been using for development.
* Since both local and remote locks are managed by the same kernel
locking code, local and remote processes can safely use file locks for
mutual exclusion. Local processes have a slight fairness advantage
compared to remote processes when contending to lock a region that has
just been unlocked. This could be avoided by enabling the code
currently hidden behind '#ifdef ADVLOCKASYNC_TESTING' in
kern_descrip.c since that would enforce strict first-come first-served
semantics for both local and remote lockers.
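To illustrate the reply de-multiplexing mentioned in the first highlight, here is a rough sketch (not taken from the patch) of how one client handle can match incoming replies to waiting callers by the RPC transaction ID (xid) carried in every Sun RPC call and reply. The structure layout, the fixed-size table and the names client_register, client_deliver and client_release are all hypothetical. A kernel version would also need a mutex around the table and a wakeup of the sleeping thread, both omitted here.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8

/* One outstanding call (hypothetical bookkeeping for this sketch). */
struct pending_call {
    uint32_t xid;       /* transaction ID sent with the call */
    int in_use;         /* slot is owned by a waiting caller */
    int done;           /* a matching reply has arrived */
    const void *reply;  /* reply payload, valid once done != 0 */
    size_t reply_len;
};

/* A client handle shared by many callers. */
struct rpc_client {
    struct pending_call calls[MAX_PENDING];
};

/* Record a call that is about to be sent; NULL if the table is full. */
struct pending_call *
client_register(struct rpc_client *cl, uint32_t xid)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        struct pending_call *pc = &cl->calls[i];

        if (!pc->in_use) {
            pc->xid = xid;
            pc->in_use = 1;
            pc->done = 0;
            pc->reply = NULL;
            pc->reply_len = 0;
            return (pc);
        }
    }
    return (NULL);
}

/*
 * Receive path: hand a reply to whichever caller is waiting on this
 * xid. Returns 1 if a waiter was found, 0 if the reply was dropped
 * (unknown xid, or a duplicate of a reply already delivered).
 */
int
client_deliver(struct rpc_client *cl, uint32_t xid,
    const void *data, size_t len)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        struct pending_call *pc = &cl->calls[i];

        if (pc->in_use && !pc->done && pc->xid == xid) {
            pc->reply = data;
            pc->reply_len = len;
            pc->done = 1;   /* a real client would wake the waiter here */
            return (1);
        }
    }
    return (0);
}

/* Free the slot once the caller has consumed its reply. */
void
client_release(struct pending_call *pc)
{
    pc->in_use = 0;
}
```

Because matching depends only on the xid and not on which thread reads the socket, replies may arrive in any order and still reach the right caller, which is what lets many threads share one handle.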