CFR: New NFS Lock Manager
Doug Rabson
dfr at rabson.org
Sat Mar 22 04:51:42 PDT 2008
I've just uploaded a new patch at
<http://people.freebsd.org/~dfr/lockd-RC2-22032008.diff>.
This fixes a serious problem on kernels not compiled with the
LOCKF_DEBUG option (I misplaced a #endif). It also includes minor
fixes to support 64-bit architectures and RELENG_7 (the patch does not
apply cleanly to RELENG_7, but it does work once you resolve the patch
rejects manually).
On 21 Mar 2008, at 10:27, Doug Rabson wrote:
> As I mentioned previously, I have been working on a brand new NFS
> Lock Manager which runs in kernel mode and uses the normal local
> locking infrastructure for its state. I'm currently trying to tie up
> the last few loose ends before committing this work to current. You
> can find a snapshot of this code at
> <http://people.freebsd.org/~dfr/lockd-RC1-20032008.diff>.
>
> To try it out, take a recent current (I last merged with current on
> 20th March) and apply the patch. Build a kernel with the NFSLOCKD
> option and add '-k' to 'rpc_lockd_flags' in rc.conf. You will need
> to build and install at least a new libc and rpc.lockd.
>
> At this point, it would be useful to get some extra eyes to look
> over my changes. In particular the following:
>
> 1. Choice of syscall number - I found one spare next to the NFS
> syscall and took that. The new syscall is listed in the FBSD_1.1
> namespace; possibly it should be somewhere else.
>
> 2. ABI compatibility - I extended the flock structure by one member
> (adding l_sysid). I have added new operations to fcntl to support
> the new extended structure, leaving the old operations in place to
> work on the old structure. The kernel translates old to new and vice
> versa. No attempt is made to allow a new userland to work with an
> old kernel.
>
> 3. The local lock manager has had a complete rewrite to support
> required features. The new local lock manager supports a more
> flexible model of lock ownership (which can support remote lock
> owners). I have replaced the inadequate deadlock detection code with
> a new (and fast) graph based system. Using the deadlock graph, I was
> able to avoid the 'thundering herd' issues the old lock code had
> when many processes were contending for the same locked region.
> Given the extent of the changes, wider testing and review would be
> extremely welcome.
>
> 4. The NFS lock manager itself is brand new code and as such ought
> to be reviewed. I have also ported the userland sunrpc code to run
> in the kernel environment which may prove useful in future.
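
To illustrate the old/new structure translation described in item 2,
here is a minimal sketch in C. It is not taken from the patch: the
names old_flock, new_flock, old_to_new and new_to_old are hypothetical,
and treating an l_sysid of 0 as "local owner" is an assumption made
for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the original structure (no l_sysid). */
struct old_flock {
	off_t	l_start;	/* starting offset */
	off_t	l_len;		/* length; 0 means to end of file */
	pid_t	l_pid;		/* lock owner */
	short	l_type;		/* lock type */
	short	l_whence;	/* how to interpret l_start */
};

/* Hypothetical stand-in for the extended structure. */
struct new_flock {
	off_t	l_start;
	off_t	l_len;
	pid_t	l_pid;
	short	l_type;
	short	l_whence;
	int	l_sysid;	/* remote system id; 0 assumed to mean local */
};

/* Widen an old-style request so that one code path can handle both. */
static void
old_to_new(const struct old_flock *ofl, struct new_flock *fl)
{
	assert(ofl != NULL && fl != NULL);
	fl->l_start = ofl->l_start;
	fl->l_len = ofl->l_len;
	fl->l_pid = ofl->l_pid;
	fl->l_type = ofl->l_type;
	fl->l_whence = ofl->l_whence;
	fl->l_sysid = 0;	/* assumption: old callers are always local */
}

/* Narrow a result for an old-style caller; l_sysid is dropped. */
static void
new_to_old(const struct new_flock *fl, struct old_flock *ofl)
{
	assert(fl != NULL && ofl != NULL);
	ofl->l_start = fl->l_start;
	ofl->l_len = fl->l_len;
	ofl->l_pid = fl->l_pid;
	ofl->l_type = fl->l_type;
	ofl->l_whence = fl->l_whence;
}
```

With this shape, the old fcntl operations would convert on entry and
on return, so only the extended structure needs to be understood by
the locking code proper.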
>
> Highlights include:
>
> * Thread-safe kernel RPC client - many threads can use the same RPC
> client handle safely with replies being de-multiplexed at the socket
> upcall (typically driven directly by the NIC interrupt) and handed
> off to whichever thread matches the reply. For UDP sockets, many RPC
> clients can share the same socket. This allows the use of a single
> privileged UDP port number to talk to an arbitrary number of remote
> hosts.
>
> * Single-threaded kernel RPC server. Adding support for a
> multi-threaded server would be relatively straightforward and would
> follow approximately the Solaris KPI. A single thread should be
> sufficient for the NLM since it should rarely block in normal
> operation.
>
> * Kernel mode NLM server supporting cancel requests and granted
> callbacks. I've tested the NLM server reasonably extensively - it
> passes both my own tests and the NFS Connectathon locking tests
> running on Solaris, Mac OS X and Ubuntu Linux.
>
> * Userland NLM client supported. While the NLM server doesn't have
> support for the local NFS client's locking needs, it does have to
> field async replies and granted callbacks from remote NLMs that the
> local client has contacted. We relay these replies to the userland
> rpc.lockd over a local domain RPC socket.
>
> * IPv6 should be supported but has not been tested since I've been
> unable to get IPv6 to work properly with the Parallels virtual
> machines that I've been using for development.
>
> * Robust deadlock detection for the local lock manager. In
> particular it will detect deadlocks caused by a lock request that
> covers more than one blocking request. As required by the NLM
> protocol, all deadlock detection happens synchronously - a user is
> guaranteed that if a lock request isn't rejected immediately, the
> lock will eventually be granted. The old system allowed for a
> 'deferred deadlock' condition where a blocked lock request could
> wake up and find that some other deadlock-causing lock owner had
> beaten them to the lock.
>
> * Since both local and remote locks are managed by the same kernel
> locking code, local and remote processes can safely use file locks
> for mutual exclusion. Local processes have no fairness advantage
> compared to remote processes when contending to lock a region that
> has just been unlocked - the local lock manager enforces a strict
> first-come first-served model for both local and remote lockers.
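
The synchronous, graph-based deadlock check described above can be
sketched roughly as follows. This is only an illustration of the
general wait-for-graph technique, not the code in the patch: the
waitgraph type, the wg_* function names, the fixed MAX_OWNERS limit
and the adjacency-matrix representation are all assumptions made for
the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_OWNERS 16	/* hypothetical limit for this sketch */

/* edge[a][b] is true when lock owner a is blocked waiting for owner b. */
struct waitgraph {
	bool edge[MAX_OWNERS][MAX_OWNERS];
};

static void
wg_init(struct waitgraph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: can 'from' reach 'target' along wait-for edges? */
static bool
wg_reaches(const struct waitgraph *g, int from, int target, bool *seen)
{
	if (from == target)
		return (true);
	if (seen[from])
		return (false);
	seen[from] = true;
	for (int next = 0; next < MAX_OWNERS; next++) {
		if (g->edge[from][next] && wg_reaches(g, next, target, seen))
			return (true);
	}
	return (false);
}

/*
 * Called before 'waiter' blocks on a request that conflicts with locks
 * held by every owner in 'holders'. Returns false, leaving the graph
 * unchanged, if waiting would close a cycle; otherwise records the new
 * edges and returns true.
 */
static bool
wg_try_block(struct waitgraph *g, int waiter, const int *holders, int nholders)
{
	assert(waiter >= 0 && waiter < MAX_OWNERS);
	for (int i = 0; i < nholders; i++) {
		bool seen[MAX_OWNERS] = { false };

		assert(holders[i] >= 0 && holders[i] < MAX_OWNERS);
		if (wg_reaches(g, holders[i], waiter, seen))
			return (false);	/* would deadlock: reject now */
	}
	for (int i = 0; i < nholders; i++)
		g->edge[waiter][holders[i]] = true;
	return (true);
}
```

Because every holder is checked before any edge is added, a request
that covers more than one blocking lock is either rejected immediately
or recorded as a whole, which matches the "no deferred deadlock"
property described above. A real implementation would also need to
remove edges when a waiter is granted its lock or gives up.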