panic/lock on 9.3-RELEASE with nullfs/nfs/zfs combination

Konstantin Belousov kostikbel at gmail.com
Thu Jul 24 16:59:23 UTC 2014


On Thu, Jul 24, 2014 at 05:42:43PM +0200, Harald Schmalzbauer wrote:
>  Hello,
> 
> I'm running 9.3-amd64 with some zfilesystems and a jail.
> 
> One zfilesystem is nullfs_mounted into jail.
> 
> Now I can export (nfsv4) that nullfs_mounted filesystem and rw-opening a
> file inside the jail from the nullfs_mounted fs works, until a client
> walks into nfs_mounted filesystem (just listing directory contents e.g.).
> So mount shows like this:
> 
> tank/my/fs15 mounted on /zfs/netshares/fs15 (zfs, NFS exported, local,
> noatime, noexec, nosuid, nfsv4acls)
> /zfs/netshares/fs15 on /.JAIL/usr/ports (nullfs, local)
> 
> 
> When I then try to open a file (rw) inside the jail from the
> nullfs_mounted filesystem, 9.3-RELEASE blocks all I/O on that
> filesystem (local or remote) completely.
> With a debug kernel I get the following panic on the nfs/jail server:
> 
> panic: LK_RETRY set with incompatible flags (0x200400) or an error occured (11)
> cpuid = 3
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a/frame 0xffffff82e54bcc70
> kdb_backtrace() at kdb_backtrace+0x37/frame 0xffffff82e54bcd30
> panic() at panic+0x1cd/frame 0xffffff82e54bce30
> _vn_lock() at _vn_lock+0x67/frame 0xffffff82e54bce90
> zfs_lookup() at zfs_lookup+0x420/frame 0xffffff82e54bcf20
> zfs_freebsd_lookup() at zfs_freebsd_lookup+0xa6/frame 0xffffff82e54bd070
> VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0xd8/frame 0xffffff82e54bd0a0
> vfs_cache_lookup() at vfs_cache_lookup+0xff/frame 0xffffff82e54bd110
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0xd8/frame 0xffffff82e54bd140
> null_lookup() at null_lookup+0x92/frame 0xffffff82e54bd1c0
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0xd8/frame 0xffffff82e54bd1f0
> lookup() at lookup+0x389/frame 0xffffff82e54bd290
> namei() at namei+0x3df/frame 0xffffff82e54bd340
> vn_open_cred() at vn_open_cred+0x1e2/frame 0xffffff82e54bd4b0
> vop_stdvptocnp() at vop_stdvptocnp+0x1af/frame 0xffffff82e54bd7e0
> null_vptocnp() at null_vptocnp+0xf5/frame 0xffffff82e54bd850
> VOP_VPTOCNP_APV() at VOP_VPTOCNP_APV+0xdb/frame 0xffffff82e54bd880
> vn_vptocnp_locked() at vn_vptocnp_locked+0x15b/frame 0xffffff82e54bd910
> vn_fullpath1() at vn_fullpath1+0x100/frame 0xffffff82e54bd970
> kern___getcwd() at kern___getcwd+0xd4/frame 0xffffff82e54bd9d0
> amd64_syscall() at amd64_syscall+0x318/frame 0xffffff82e54bdaf0
> Xfast_syscall() at Xfast_syscall+0xf7/frame 0xffffff82e54bdaf0
> --- syscall (326, FreeBSD ELF64, sys___getcwd), rip = 0x8011a191c, rsp = 0x7fffffffe658, rbp = 0x801873400 ---
> KDB: enter: panic
> [ thread pid 1905 tid 100856 ]
> Stopped at kdb_enter+0x3b: movq $0,0x642172(%rip)
> 
> As mentioned, this panic happens only if an nfs(v4) client visits fs15
> (the exported and nullfs_mounted fs) and I afterwards try to rw-open any
> file on the nullfs.
> 
> How can I provide useful info with KDB? I don't have a dumpdev available
> on that machine.
> http://www.es.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html
> seems not applicable; there is no /var/crash/* here.
> 

The lockmgr flags are LK_SHARED | LK_RETRY, and error 11 == EDEADLK
indicates that the lock is already held by curthread in exclusive
mode. I am interested in which line of code took that lock.
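As a rough illustration of how the flag word in the panic message breaks
down, the sketch below decodes 0x200400 against a small table. The numeric
values are an assumption, written from memory of sys/sys/lockmgr.h, and
should be checked against the source tree in use; lk_decode() and the table
are hypothetical helpers, not kernel code.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * ASSUMPTION: these values are recalled from sys/sys/lockmgr.h and
 * cover only a few flags; verify them against your own source tree.
 */
struct lk_flag {
	unsigned int	 value;
	const char	*name;
};

static const struct lk_flag lk_flags[] = {
	{ 0x000100, "LK_INTERLOCK" },
	{ 0x000200, "LK_NOWAIT" },
	{ 0x000400, "LK_RETRY" },
	{ 0x080000, "LK_EXCLUSIVE" },
	{ 0x200000, "LK_SHARED" },
};

/*
 * Hypothetical helper: write the names of the known bits set in 'flags'
 * into 'buf', joined by " | ".  Returns the bits that were not written
 * out (0 when the whole word was decoded).
 */
static unsigned int
lk_decode(unsigned int flags, char *buf, size_t len)
{
	unsigned int rest = flags;
	size_t i, used = 0;
	int n;

	if (len == 0)
		return (rest);
	buf[0] = '\0';
	for (i = 0; i < sizeof(lk_flags) / sizeof(lk_flags[0]); i++) {
		if ((flags & lk_flags[i].value) == 0)
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
		    used > 0 ? " | " : "", lk_flags[i].name);
		if (n < 0 || (size_t)n >= len - used) {
			/* Out of space: drop the partial name and stop. */
			buf[used] = '\0';
			break;
		}
		used += (size_t)n;
		rest &= ~lk_flags[i].value;
	}
	return (rest);
}
```

Calling lk_decode(0x200400, buf, sizeof(buf)) with a 64-byte buffer leaves
"LK_RETRY | LK_SHARED" in buf and returns 0, which matches the reading of
the flags given above.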

Add the DDB, INVARIANTS, WITNESS and DEBUG_VFS_LOCKS options to the kernel
config, reproduce the issue and, once the panic has occurred and you
are at the ddb prompt, issue the command 'show alllocks'.
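
For reference, the options named above would typically be written in the
kernel configuration file roughly as follows. This is only a sketch: KDB
and INVARIANT_SUPPORT are not mentioned above and are added here on the
assumption that DDB and INVARIANTS depend on them; check the NOTES files
for your release before relying on it.

```
options 	KDB			# assumed prerequisite for DDB
options 	DDB
options 	INVARIANT_SUPPORT	# assumed prerequisite for INVARIANTS
options 	INVARIANTS
options 	WITNESS
options 	DEBUG_VFS_LOCKS
```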

Also run 'show mount', and then 'show mount <addr>', where <addr> is
the address of your nullfs mount point as printed by 'show mount'.

I need all console output starting from the panic message.