kern/163770: LOR between zfs&syncer + vnlru leading to ZFS deadlock
Boris Lytochkin
lytboris at yandex-team.ru
Mon Jan 2 08:20:09 UTC 2012
>Number: 163770
>Category: kern
>Synopsis: LOR between zfs&syncer + vnlru leading to ZFS deadlock
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Mon Jan 02 08:20:08 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator: Boris Lytochkin
>Release: RELENG_8
>Organization:
Yandex
>Environment:
FreeBSD skylla.yandex.net 8.2-STABLE FreeBSD 8.2-STABLE #7: Tue Dec 27 19:54:33 MSK 2011 lytboris at skylla.yandex.net:/usr/obj/usr/src/sys/CACTI amd64
>Description:
Deadlocks are seen periodically; most of them can be triggered with a low kern.maxvnodes while running /etc/periodic/security/100.chksetuid (which performs a large find(1) traversal).
The LOR itself, as seen a couple of minutes after the server boots into multiuser mode:
--- syscall (5, FreeBSD ELF64, open), rip = 0x800f8666c, rsp = 0x7fffffffe8d8, rbp = 0x1b0 ---
lock order reversal:
1st 0xffffff00508c1098 syncer (syncer) @ /usr/src/sys/kern/vfs_subr.c:1737
2nd 0xffffff108f8abba8 zfs (zfs) @ /usr/src/sys/kern/vfs_subr.c:2137
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
_witness_debugger() at _witness_debugger+0x2c
witness_checkorder() at witness_checkorder+0x651
__lockmgr_args() at __lockmgr_args+0xb98
vop_stdlock() at vop_stdlock+0x39
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x52
_vn_lock() at _vn_lock+0x47
vget() at vget+0x56
vfs_msync() at vfs_msync+0xa5
sync_fsync() at sync_fsync+0x12a
VOP_FSYNC_APV() at VOP_FSYNC_APV+0x4a
sync_vnode() at sync_vnode+0x157
sched_sync() at sched_sync+0x1b1
fork_exit() at fork_exit+0x11d
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff9b0b285d00, rbp = 0 ---
When the deadlock appears, the syncer gets stuck in a zilog commit while vnlru sleeps in zfs_zget:
procstat: 20 100179 vnlru - mi_switch sleepq_timedwait _sleep zfs_zget zfs_get_data zil_commit zfs_freebsd_write VOP_WRITE_APV vnode_pager_generic_putpages VOP_PUTPAGES_APV vnode_pager_putpages vm_pageout_flush vm_object_page_collect_flush vm_object_page_clean vm_object_terminate vnode_destroy_vobject zfs_freebsd_reclaim VOP_RECLAIM_APV
procstat: 21 100180 syncer - mi_switch sleepq_wait _cv_wait zil_commit zfs_sync sync_fsync VOP_FSYNC_APV sync_vnode sched_sync fork_exit fork_trampoline
procstat: 63429 101808 rrdtool - mi_switch sleepq_catch_signals sleepq_wait_sig _sleep pipe_read dofileread kern_readv read syscallenter syscall Xfast_syscall
procstat: 63430 101562 rrdtool - mi_switch sleepq_catch_signals sleepq_wait_sig _sleep pipe_read dofileread kern_readv read syscallenter syscall Xfast_syscall
ps: 0 20 0 0 46 0 0 16 zcolli DL ?? 1:36.89 [vnlru]
ps: 0 21 0 0 51 0 0 16 zilog- DL ?? 151:27.35 [syncer]
ps: 80 63429 63419 0 48 0 13280 2032 piperd I ?? 2:36.48 /usr/local/bin/rrdtool -
ps: 80 63430 63419 0 76 0 13280 1712 piperd I ?? 0:00.01 /usr/local/bin/rrdtool -
At this moment these locks are held:
db> show alllocks
Process 69468 (sshd) thread 0xffffff0f49769460 (102462)
exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xffffff131329a0f8) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148
Process 63419 (php) thread 0xffffff115a574000 (102314)
shared lockmgr zfs (zfs) r = 0 (0xffffff0e7822e448) locked @ /usr/src/sys/kern/vfs_subr.c:2137
Process 21 (syncer) thread 0xffffff0027d96000 (100180)
exclusive lockmgr syncer (syncer) r = 0 (0xffffff0045d94098) locked @ /usr/src/sys/kern/vfs_subr.c:1737
Process 20 (vnlru) thread 0xffffff0027d96460 (100179)
exclusive lockmgr zfs (zfs) r = 0 (0xffffff140637f620) locked @ /usr/src/sys/kern/vfs_subr.c:2249
Excerpt from vfs_subr.c:
1732:
                vdrop(vp);
                mtx_lock(&sync_mtx);
                return (*bo == LIST_FIRST(slp));
        }
        vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
        (void) VOP_FSYNC(vp, MNT_LAZY, td);
        VOP_UNLOCK(vp, 0);
        vn_finished_write(mp);
        BO_LOCK(*bo);
        if (((*bo)->bo_flag & BO_ONWORKLST) != 0) {
2132:
        if ((flags & LK_INTERLOCK) == 0)
                VI_LOCK(vp);
        vholdl(vp);
        if ((error = vn_lock(vp, flags | LK_INTERLOCK)) != 0) {
                vdrop(vp);
                CTR2(KTR_VFS, "%s: impossible to lock vnode %p", __func__,
                    vp);
                return (error);
        }
2244:
         */
        vp->v_iflag |= VI_OWEINACT;
        switch (func) {
        case VPUTX_VRELE:
                error = vn_lock(vp, LK_EXCLUSIVE | LK_INTERLOCK);
                VI_LOCK(vp);
                break;
        case VPUTX_VPUT:
                if (VOP_ISLOCKED(vp) != LK_EXCLUSIVE) {
                        error = VOP_LOCK(vp, LK_UPGRADE | LK_INTERLOCK |
The deadlock is observed per filesystem: other ZFS filesystems within the pool that contains the deadlocked one keep working normally.
>How-To-Repeat:
Create a ZFS filesystem with a large number (~300k) of RRD files and update them periodically.
>Fix:
It seems that setting kern.maxvnodes to a higher value works around this hang: at least 100.chksetuid can then be executed without triggering the deadlock.
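A possible workaround along those lines, as a config sketch (the value 500000 is only an example; check the current limit and vnode usage first and tune to the workload):

```
# check the current limit and how many vnodes are in use
sysctl kern.maxvnodes vfs.numvnodes
# raise the limit at runtime (example value)
sysctl kern.maxvnodes=500000
# make the setting persistent across reboots
echo 'kern.maxvnodes=500000' >> /etc/sysctl.conf
```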
>Release-Note:
>Audit-Trail:
>Unformatted: