LOR on 10.2-REL-p8 w/ ZFS + NFS?

Sat Jan 9 15:30:54 UTC 2016

Adrian Chadd wrote:
> Hiya,
> 
> Someone asked me about this. It's happening on a 10.2-REL-p8 box
> serving ZFS via NFS.
> 
> lock order reversal:
>  1st 0xfffff801ed92ca28 zfs (zfs) @ /usr/src/sys/fs/nfsserver/nfs_nfsdsocket.c:967
>  2nd 0xfffff8015f98a068 ufs (ufs) @ /usr/src/sys/kern/vfs_vnops.c:534
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0467b8bb30
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe0467b8bbe0
> witness_checkorder() at witness_checkorder+0xe24/frame 0xfffffe0467b8bc70
> __lockmgr_args() at __lockmgr_args+0x9d9/frame 0xfffffe0467b8bdb0
> ffs_lock() at ffs_lock+0x92/frame 0xfffffe0467b8be00
> VOP_LOCK1_APV() at VOP_LOCK1_APV+0xfc/frame 0xfffffe0467b8be30
> _vn_lock() at _vn_lock+0xd2/frame 0xfffffe0467b8bea0
> vn_rdwr() at vn_rdwr+0x1c1/frame 0xfffffe0467b8bf80
> nfsrv_writestable() at nfsrv_writestable+0xbd/frame 0xfffffe0467b8bff0
> nfsrv_openupdate() at nfsrv_openupdate+0x557/frame 0xfffffe0467b8c480
> nfsrvd_openconfirm() at nfsrvd_openconfirm+0x175/frame 0xfffffe0467b8c560
> nfsrvd_dorpc() at nfsrvd_dorpc+0xf66/frame 0xfffffe0467b8c720
> nfssvc_program() at nfssvc_program+0x4e6/frame 0xfffffe0467b8c8d0
> svc_run_internal() at svc_run_internal+0xbb7/frame 0xfffffe0467b8ca60
> svc_thread_start() at svc_thread_start+0xb/frame 0xfffffe0467b8ca70
> fork_exit() at fork_exit+0x84/frame 0xfffffe0467b8cab0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0467b8cab0
> --- trap 0xc, rip = 0x80089242a, rsp = 0x7fffffffe5d8, rbp = 0x7fffffffe880 ---
> 
> Any ideas?
> 
This shouldn't cause a deadlock in practice. The second lock is on the
vnode of a single file, called "stablerestart", that is used exclusively
by the NFS server for NFSv4. It shouldn't normally be on an exported volume
and shouldn't be accessed by anything other than the nfsd. (It is only
updated when a new NFSv4 client creates a clientid, which basically
means "once per client" or "once per client mount" depending on the client.)

Also, in the trace above, the file lives on a UFS volume, so there is zero
chance of a deadlock against a vnode on ZFS.

I suppose if "stablerestart" existed on an exported volume and was accessed
erroneously by a client as the first file opened after mounting, a deadlock
is conceivable.
--> I'll take a look and maybe the code can check to see if the nfsd thread
    already has a lock on the vnode to avoid this unlikely scenario.
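
To illustrate the "check whether this thread already holds the lock" idea,
here is a minimal user-space sketch. It is not the nfsd code and not the
kernel's lockmgr: struct model_vnode and every model_* function are
hypothetical names made up for this example, and pthread mutexes stand in
for vnode locks. In the kernel, a query such as VOP_ISLOCKED(9) would
presumably play the role of model_held_by_me(), but that is an assumption.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vnode lock that remembers its owner. */
struct model_vnode {
	pthread_mutex_t lock;	/* the "vnode lock" itself */
	pthread_mutex_t meta;	/* protects owner and owned */
	pthread_t owner;
	bool owned;
};

static void
model_init(struct model_vnode *vp)
{
	pthread_mutex_init(&vp->lock, NULL);
	pthread_mutex_init(&vp->meta, NULL);
	vp->owned = false;
}

/* True only if the calling thread is the current holder. */
static bool
model_held_by_me(struct model_vnode *vp)
{
	bool mine;

	pthread_mutex_lock(&vp->meta);
	mine = vp->owned && pthread_equal(vp->owner, pthread_self());
	pthread_mutex_unlock(&vp->meta);
	return (mine);
}

static void
model_lock(struct model_vnode *vp)
{
	pthread_mutex_lock(&vp->lock);
	pthread_mutex_lock(&vp->meta);
	vp->owner = pthread_self();
	vp->owned = true;
	pthread_mutex_unlock(&vp->meta);
}

static void
model_unlock(struct model_vnode *vp)
{
	pthread_mutex_lock(&vp->meta);
	/* Only the holder may unlock. */
	assert(vp->owned && pthread_equal(vp->owner, pthread_self()));
	vp->owned = false;
	pthread_mutex_unlock(&vp->meta);
	pthread_mutex_unlock(&vp->lock);
}

/*
 * Take the lock unless this thread already holds it.
 * Returns true if the lock was taken here (the caller must unlock),
 * false if it was already held and was left untouched.
 */
static bool
model_lock_unless_held(struct model_vnode *vp)
{
	if (model_held_by_me(vp))
		return (false);
	model_lock(vp);
	return (true);
}
```

A caller would unlock only when model_lock_unless_held() returned true, so a
thread that already holds the lock neither blocks on itself nor drops a lock
it did not take here.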

In summary, I may commit a fix for this someday, but I don't think it
will ever cause a deadlock in practice.

Thanks for reporting it, rick

> 
> 
> -a