Reboot Loop: ffs_snapshot/bufwait LORs [Was: Re: 8.0-RELEASE hangs with lighttpd, unionfs related? Some traces included]

Harald Schmalzbauer h.schmalzbauer at omnilan.de
Fri Feb 5 11:39:56 UTC 2010


Harald Schmalzbauer schrieb am 05.02.2010 12:31 (localtime):
> Hello,
> 
> when I start lighttpd at boot time, the system half-locks in a way, that 
> any process, which accesses /usr/local/etc stalls. It's also impossible 
> to shut down.
> /usr/local/etc is unionfs mounted.
> I compiled a kernel with debug options.
> 
> When mounting unionfs at boot time, here's the firt LOR with trace:
> 
> lock order reversal:
>  1st 0xffffff00018b47f8 unionfs (unionfs) @ 
> /usr/src/sys/fs/unionfs/union_subr.c:356
>  2nd 0xffffff00018d9d80 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2188
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> _witness_debugger() at _witness_debugger+0x49
> witness_checkorder() at witness_checkorder+0x7ea
> __lockmgr_args() at __lockmgr_args+0xcbd
> ffs_lock() at ffs_lock+0x8c
> VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
> _vn_lock() at _vn_lock+0x50
> vrele() at vrele+0x120
> unionfs_noderem() at unionfs_noderem+0x1c4
> unionfs_reclaim() at unionfs_reclaim+0x11
> vgonel() at vgonel+0xf1
> vrecycle() at vrecycle+0x58
> unionfs_inactive() at uniougen2.2: <vendor 0x1267> at usbus2
> nfs_inactive+ukbd0: 0<vendor 0x1267 product 0x0103, class 0/0, rev 
> 1.10/1.01, addr 2> on usbus2
> x20
> vinactive() at vinactive+0x6b
> vput() at vput+0x216
> kern_statatkbd1 at ukbd0
> _vnhook() at kern_statat_vnhook+0xe9
> kern_statat() at kern_statat+0x15
> stat() at stat+0x22
> syscall() at syscall+0x1af
> Xfast_syscall() at Xfast_syscall+0xe1
> --- suhid0: y<vendor 0x1267 product 0x0103, class 0/0, rev 1.10/1.01, 
> addr 2> on usbus2
> scall (188, FreeBSD ELF64, stat), rip = 0x8009a055c, rsp = 
> 0x7fffffffe5b8, rbp = 0x800b312c0 ---
> KDB: enter: witness_checkorder
> [thread pid 27 tid 100068 ]
> Stopped at      kdb_enter+0x3d: movq    $0,0x4c04dc(%rip)
> Tracing pid 27 tid 100068 td 0xffffff00016fe720
> kdb_enter() at kdb_enter+0x3d
> witness_checkorder() at witness_checkorder+0x7ea
> __lockmgr_args() at __lockmgr_args+0xcbd
> ffs_lock() at ffs_lock+0x8c
> VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
> _vn_lock() at _vn_lock+0x50
> vrele() at vrele+0x120
> unionfs_noderem() at unionfs_noderem+0x1c4
> unionfs_reclaim() at unionfs_reclaim+0x11
> vgonel() at vgonel+0xf1
> vrecycle() at vrecycle+0x58
> unionfs_inactive() at unionfs_inactive+0x20
> vinactive() at vinactive+0x6b
> vput() at vput+0x216
> kern_statat_vnhook() at kern_statat_vnhook+0xe9
> kern_statat() at kern_statat+0x15
> stat() at stat+0x22
> syscall() at syscall+0x1af
> Xfast_syscall() at Xfast_syscall+0xe1
> --- syscall (188, FreeBSD ELF64, stat), rip = 0x8009a055c, rsp = 
> 0x7fffffffe5b8, rbp = 0x800b312c0 -
> 
> 
> Like mentioned, there is that strange problem with lighttpd started at 
> boot time. Other /urs/local/etc/rc.d startups don't lead to a 
> /usr/local/etc deadlock.
> Unfortunately I don't get any panic or anything else when the hang happens.
> How can I aquire more information?
> It's no problem to log in and to do everything else outside 
> /usr/local/etc...
> 
> 
> ======================================================================
> 
> Here's a LOR at shutdown with trace:
> 
> lock order reversal:
>  1st 0xffffff0001bc2098 ufs (ufs) @ /usr/src/sys/kern/vfs_mount.c:1200
>  2nd 0xffffff0001bc1ba8 devfs (devfs) @ 
> /usr/src/sys/ufs/ffs/ffs_vfsops.c:1194
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> _witness_debugger() at _witness_debugger+0x49
> witness_checkorder() at witness_checkorder+0x7ea
> __lockmgr_args() at __lockmgr_args+0xcbd
> vop_stdlock() at vop_stdlock+0x39
> VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
> _vn_lock() at _vn_lock+0x50
> ffs_flushfiles() at ffs_flushfiles+0x93
> ffs_unmount() at ffs_unmount+0x48
> dounmount() at dounmount+0x2ac
> vfs_unmountall() at vfs_unmountall+0x54
> boot() at boot+0x814
> mkdumpheader() at mkdumpheader
> syscall() at syscall+0x1af
> Xfast_syscall() at Xfast_syscall+0xe1
> --- syscall (55, FreeBSD ELF64, reboot), rip = 0x40829c, rsp = 
> 0x7fffffffe738, rbp = 0x402290 ---
> KDB: enter: witness_checkorder
> [thread pid 1 tid 100002 ]
> Stopped at      kdb_enter+0x3d: movq    $0,0x4c04dc(%rip)
> 
> Tracing pid 1 tid 100002 td 0xffffff0001310ab0
> kdb_enter() at kdb_enter+0x3d
> witness_checkorder() at witness_checkorder+0x7ea
> __lockmgr_args() at __lockmgr_args+0xcbd
> vop_stdlock() at vop_stdlock+0x39
> VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
> _vn_lock() at _vn_lock+0x50
> ffs_flushfiles() at ffs_flushfiles+0x93
> ffs_unmount() at ffs_unmount+0x48
> dounmount() at dounmount+0x2ac
> vfs_unmountall() at vfs_unmountall+0x54
> boot() at boot+0x814
> mkdumpheader() at mkdumpheader
> syscall() at syscall+0x1af
> Xfast_syscall() at Xfast_syscall+0xe1
> --- syscall (55, FreeBSD ELF64, reboot), rip = 0x40829c, rsp = 
> 0x7fffffffe738, rbp = 0x402290 ---
> 
> Any Help highly appreciated!
> 
> Thanks,
> 
> -Harry

Additional LORs while regular machine operation (background fsck) which 
leads to reboot!
I have access over the serail console, but the machine is unresponsive 
after that. So I'm now in a endelss reboot loop with the debug kernel...

lock order reversal:
  1st 0xffffff0001b899d0 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:423
  2nd 0xffffff802970fc28 bufwait (bufwait) @ 
/usr/src/sys/kern/vfs_bio.c:2559
  3rd 0xffffff00018b4448 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:544
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
ffs_snapshot() at ffs_snapshot+0x1b70
ffs_mount() at ffs_mount+0x651
vfs_donmount() at vfs_donmount+0xcd4
nmount() at nmount+0x74
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007acfec, rsp = 
0x7fffffffe988, rbp = 0x800a028e-
KDB: enter: witness_checkorder
[thread pid 947 tid 100073 ]
Stopped at      kdb_enter+0x3d: movq    $0,0x4c04dc(%rip)
db> lock order reversal:
  1st 0xffffff00018b4470 vnode interlock (vnode interlock) @ 
/usr/src/sys/ufs/ffs/ffs_snapshot.c:523
  2nd 0xffffff8000422028 uhci2 (uhci2) @ 
/usr/src/sys/dev/usb/controller/uhci.c:1551
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
_mtx_lock_flags() at _mtx_lock_flags+0x68
uhci_do_poll() at uhci_do_poll+0x2e
usbd_transfer_poll() at usbd_transfer_poll+0x18d
ukbd_do_poll() at ukbd_do_poll+0x63
ukbd_get_key() at ukbd_get_key+0xa8
ukbd_read_char() at ukbd_read_char+0xaa
scgetc() at scgetc+0x5b
sc_cngetc() at sc_cngetc+0xf2
cncheckc() at cncheckc+0x65
cngetc() at cngetc+0x1c
db_readline() at db_readline+0x79
db_read_line() at db_read_line+0x15
db_command_loop() at db_command_loop+0x38
db_trap() at db_trap+0x87
kdb_trap() at kdb_trap+0x82
trap() at trap+0x18f
calltrap() at calltrap+0x8
--- trap 0x3, rip = 0xffffffff80381141, rsp = 0xffffff803e959000, rbp = 
0xffffff803e959020 ---
kdb_enter() at kdb_enter+0x3d
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
ffs_snapshot() at ffs_snapshot+0x1b70
ffs_mount() at ffs_mount+0x651
vfs_donmount() at vfs_donmount+0xcd4
nmount() at nmount+0x74
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007acfec, rsp = 
0x7fffffffe988, rbp = 0x800a028e-
lock order reversal:
  1st 0xffffff00018b4470 vnode interlock (vnode interlock) @ 
/usr/src/sys/ufs/ffs/ffs_snapshot.c:523
  2nd 0xffffff0001747890 USB device mutex (USB device mutex) @ 
/usr/src/sys/dev/usb/usb_device.c:1410
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
_mtx_lock_flags() at _mtx_lock_flags+0x68
usbd_clear_stall_proc() at usbd_clear_stall_proc+0x49
usbd_transfer_poll() at usbd_transfer_poll+0x1c0
ukbd_do_poll() at ukbd_do_poll+0x63
ukbd_get_key() at ukbd_get_key+0xa8
ukbd_read_char() at ukbd_read_char+0xaa
scgetc() at scgetc+0x5b
sc_cngetc() at sc_cngetc+0xf2
cncheckc() at cncheckc+0x65
cngetc() at cngetc+0x1c
db_readline() at db_readline+0x79
db_read_line() at db_read_line+0x15
db_command_loop() at db_command_loop+0x38
db_trap() at db_trap+0x87
kdb_trap() at kdb_trap+0x82
trap() at trap+0x18f
calltrap() at calltrap+0x8
--- trap 0x3, rip = 0xffffffff80381141, rsp = 0xffffff803e959000, rbp = 
0xffffff803e959020 ---
kdb_enter() at kdb_enter+0x3d
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
ffs_snapshot() at ffs_snapshot+0x1b70
ffs_mount() at ffs_mount+0x651
vfs_donmount() at vfs_donmount+0xcd4
nmount() at nmount+0x74
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007acfec, rsp = 
0x7fffffffe988, rbp = 0x800a028e-
lock order reversal: (Giant after non-sleepable)
  1st 0xffffff00018b4470 vnode interlock (vnode interlock) @ 
/usr/src/sys/ufs/ffs/ffs_snapshot.c:523
  2nd 0xffffffff80820780 Giant (Giant) @ 
/usr/src/sys/dev/usb/usb_transfer.c:1952
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
_mtx_lock_flags() at _mtx_lock_flags+0x68
usb_callback_proc() at usb_callback_proc+0x48
usbd_transfer_poll() at usbd_transfer_poll+0x1e9
ukbd_do_poll() at ukbd_do_poll+0x63
ukbd_get_key() at ukbd_get_key+0xa8
ukbd_read_char() at ukbd_read_char+0xaa
scgetc() at scgetc+0x5b
sc_cngetc() at sc_cngetc+0xf2
cncheckc() at cncheckc+0x65
cngetc() at cngetc+0x1c
db_readline() at db_readline+0x79
db_read_line() at db_read_line+0x15
db_command_loop() at db_command_loop+0x38
db_trap() at db_trap+0x87
kdb_trap() at kdb_trap+0x82
trap() at trap+0x18f
calltrap() at calltrap+0x8
--- trap 0x3, rip = 0xffffffff80381141, rsp = 0xffffff803e959000, rbp = 
0xffffff803e959020 ---
kdb_enter() at kdb_enter+0x3d
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
ffs_snapshot() at ffs_snapshot+0x1b70
ffs_mount() at ffs_mount+0x651
vfs_donmount() at vfs_donmount+0xcd4
nmount() at nmount+0x74
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007acfec, rsp = 
0x7fffffffe988, rbp = 0x800a028e-

lock order reversal:
  1st 0xffffff802970fc28 bufwait (bufwait) @ 
/usr/src/sys/kern/vfs_bio.c:2559
  2nd 0xffffff00440ef6b0 snaplk (snaplk) @ 
/usr/src/sys/ufs/ffs/ffs_snapshot.c:793
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x49
witness_checkorder() at witness_checkorder+0x7ea
__lockmgr_args() at __lockmgr_args+0xcbd
ffs_lock() at ffs_lock+0x8c
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x50
ffs_snapshot() at ffs_snapshot+0x17d7
ffs_mount() at ffs_mount+0x651
vfs_donmount() at vfs_donmount+0xcd4
nmount() at nmount+0x74
syscall() at syscall+0x1af
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007acfec, rsp = 
0x7fffffffe988, rbp = 0x800a028e-
KDB: enter: witness_checkorder
[thread pid 947 tid 100073 ]
Stopped at      kdb_enter+0x3d: movq    $0,0x4c04dc(%rip)

Thanks for any help,

-Harry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 196 bytes
Desc: OpenPGP digital signature
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20100205/400bc058/signature.pgp


More information about the freebsd-stable mailing list