LOR in -current NFS client code + possible patch
Don Lewis
truckman at FreeBSD.org
Mon Apr 21 23:20:18 PDT 2003
I had something wedge up in the NFS client code in a recent version of
-current.
5.0-CURRENT #61: Sat Apr 19 00:36:17 PDT 2003
I got the console message:
Apr 21 20:52:07 scratch kernel: nfs server mousie:/home: not responding
TCP connections continued to work, and another client was still able to
access the server, so the problem was definitely in the client code.
When I attempted to kill the process that seemed to be responsible for
wedging NFS, I got a lock order reversal message:
Apr 21 20:54:33 scratch kernel: lock order reversal
Apr 21 20:54:33 scratch kernel: 1st 0xc893ab68 vnode interlock (vnode interlock) @ /usr/src/sys/nfsclient/nfs_vnops.c:2792
Apr 21 20:54:33 scratch kernel: 2nd 0xc69f4248 process lock (process lock) @ /usr/src/sys/nfsclient/nfs_socket.c:1239
Apr 21 20:54:33 scratch kernel: Stack backtrace:
The backtrace (copied by hand):
witness_lock()
_mtx_lock_flags()
nfs_sigintr() at nfs_sigintr+0x77
nfs_flush() at nfs_flush+0x763
nfs_close() at nfs_close+0x7a
vn_close()
vn_closefile()
fdrop_locked()
fdrop()
closef()
close()
I don't know what caused the original problem, but the lock order
reversal happens because nfs_flush() calls nfs_sigintr() while holding
the vnode interlock, and nfs_sigintr() calls PROC_LOCK(), which acquires
the process lock after the interlock.
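For anyone who wants to look at the ordering outside the kernel, here is
a rough userland sketch. It is not the NFS code: two pthread mutexes
stand in for the vnode interlock and the process lock, and every name in
it (vi_lock, fake_sigintr, wait_error_unpatched, wait_error_patched,
reversals) is made up for illustration. It only counts how often the
"process lock" is taken while the "interlock" is held, which is the
order that witness flagged.

```c
/*
 * Userland sketch, not kernel code. Two pthread mutexes stand in for the
 * vnode interlock and the process lock; every name below is hypothetical.
 */
#include <assert.h>
#include <pthread.h>

pthread_mutex_t interlock = PTHREAD_MUTEX_INITIALIZER; /* stand-in for VI_MTX(vp) */
pthread_mutex_t proc_lock = PTHREAD_MUTEX_INITIALIZER; /* stand-in for the process lock */

int interlock_held; /* hand-maintained flag, single-threaded demo only */
int reversals;      /* times proc_lock was taken with interlock held */

void vi_lock(void)
{
    pthread_mutex_lock(&interlock);
    interlock_held = 1;
}

void vi_unlock(void)
{
    assert(interlock_held);
    interlock_held = 0;
    pthread_mutex_unlock(&interlock);
}

/* Stand-in for nfs_sigintr(): takes the process lock to check for signals. */
int fake_sigintr(int signal_pending)
{
    int intr;

    if (interlock_held)
        reversals++; /* interlock first, process lock second */
    pthread_mutex_lock(&proc_lock);
    intr = signal_pending;
    pthread_mutex_unlock(&proc_lock);
    return intr;
}

/* Unpatched shape: the signal check runs with the interlock still held. */
int wait_error_unpatched(int signal_pending)
{
    int intr;

    vi_lock();
    intr = fake_sigintr(signal_pending);
    vi_unlock();
    return intr;
}

/* Patched shape: drop the interlock, check, retake it only to keep going. */
int wait_error_patched(int signal_pending)
{
    int intr;

    vi_lock();
    vi_unlock(); /* released before the signal check */
    intr = fake_sigintr(signal_pending);
    if (!intr) {
        vi_lock();   /* retaken where the real loop would sleep again */
        vi_unlock(); /* the sketch has no loop, so release right away */
    }
    return intr;
}
```

Calling wait_error_unpatched() bumps the reversals counter on every
call, while wait_error_patched() leaves it alone whether or not a signal
is pending.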
It looks to me like the following patch is the proper fix. There is
another call to nfs_sigintr() in nfs_flush(), but it looks like
BUF_TIMELOCK() must release the interlock in the error case, so that
call should not need the same change. Comments?
Index: nfs_vnops.c
===================================================================
RCS file: /home/ncvs/src/sys/nfsclient/nfs_vnops.c,v
retrieving revision 1.202
diff -u -r1.202 nfs_vnops.c
--- nfs_vnops.c 31 Mar 2003 23:26:10 -0000 1.202
+++ nfs_vnops.c 22 Apr 2003 06:03:28 -0000
@@ -2838,8 +2842,8 @@
             error = msleep((caddr_t)&vp->v_numoutput, VI_MTX(vp),
                 slpflag | (PRIBIO + 1), "nfsfsync", slptimeo);
             if (error) {
+                VI_UNLOCK(vp);
                 if (nfs_sigintr(nmp, NULL, td)) {
-                    VI_UNLOCK(vp);
                     error = EINTR;
                     goto done;
                 }
@@ -2847,6 +2851,7 @@
                     slpflag = 0;
                     slptimeo = 2 * hz;
                 }
+                VI_LOCK(vp);
             }
         }
         if (!TAILQ_EMPTY(&vp->v_dirtyblkhd) && commit) {