ZFS : panic("sleeping thread")
Thomas Backman
serenity at exscape.org
Thu Jun 18 11:50:08 UTC 2009
On May 27, 2009, at 07:58 PM, Artem Belevich wrote:
> Hi,
>
> While recent ZFS improvements got rid of random hangs I used to see,
> there's still one problem that I keep running into -- panic in ZFS
> under heavy load. I can reproduce it by doing a build with -j16 in a
> jail running i386 binaries on -CURRENT/amd64 running on a box with a
> quad-core CPU. It takes a while to reproduce, but it usually shows up
> within a couple of hours.
>
> Sleeping thread (tid 100606, pid 32147) owns a non-sleepable lock
> sched_switch() at sched_switch+0xed
> mi_switch() at mi_switch+0x16f
> sleepq_wait() at sleepq_wait+0x42
> _sx_xlock_hard() at _sx_xlock_hard+0x1f0
> _sx_xlock() at _sx_xlock+0x4e
> rrw_exit() at rrw_exit+0x1d
> zfs_freebsd_getattr() at zfs_freebsd_getattr+0x2be
> VOP_GETATTR_APV() at VOP_GETATTR_APV+0x44
> filt_vfsread() at filt_vfsread+0x51
> knote() at knote+0xc2
> VOP_WRITE_APV() at VOP_WRITE_APV+0x11f
> vn_write() at vn_write+0x279
> dofilewrite() at dofilewrite+0x85
> kern_writev() at kern_writev+0x60
> write() at write+0x54
> ia32_syscall() at ia32_syscall+0x236
> Xint0x80_syscall() at Xint0x80_syscall+0x85
> --- syscall (4, FreeBSD ELF32, write), rip = 0x78162153, rsp = 0xffff945c, rbp = 0xffff9478 ---
>
> It appears that locking within ZFS conflicts with vnode locking. The
> back-trace is always the same.
>
> For now, I've applied the following patch to disable the panic, but it
> would be good if someone familiar with VFS locking in FreeBSD could
> take a look.
> If you need any additional info, let me know.
>
> Thanks,
> --Artem
>
> diff -r 930d975c8103 src/sys/kern/subr_turnstile.c
> --- a/sys/kern/subr_turnstile.c Fri Dec 05 16:12:43 2008 -0800
> +++ b/sys/kern/subr_turnstile.c Fri Dec 12 14:31:16 2008 -0800
> @@ -219,7 +219,10 @@
> #ifdef DDB
> db_trace_thread(td, -1);
> #endif
> - panic("sleeping thread");
> + /* Don't propagate priority to a sleeping thread. */
> + thread_unlock(td);
> + return;
> + // panic("sleeping thread");
> }
>
> /*
Does anyone have any updates on this? I just got a "sleeping thread" panic
in ZFS after doing a zfs rollback. Unfortunately, running "panic" in the
debugger failed with "dump device too small" (even though the dump device
is RAM-sized), so I don't have a saved backtrace. However, the backtrace I
saw in the debugger was *not* the same as yours: there was no _sx_xlock in
it, but that's pretty much all I know about it. :(
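
For anyone following along, here is a rough user-space sketch of the
condition that the check in the quoted diff appears to guard against:
a thread that still owns a non-sleepable (mutex-style) lock has gone to
sleep on a sleepable (sx-style) lock, and another thread then tries to
lend it priority. Everything named model_* is hypothetical and made up
for illustration; this is not the kernel's struct thread or the real
subr_turnstile.c code, only a simplified model of the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a kernel thread. */
struct model_thread {
    int  tid;
    int  nonsleepable_held;  /* number of mutex-style locks currently owned */
    bool sleeping;           /* true while blocked on a sleepable lock */
};

enum model_result {
    MODEL_OK,              /* owner is runnable, priority can be lent */
    MODEL_SLEEPING_OWNER   /* owner sleeps while owning a mutex-style lock */
};

/* Acquire an (uncontended) mutex-style, non-sleepable lock. */
void model_mtx_lock(struct model_thread *td)
{
    assert(td != NULL);
    td->nonsleepable_held++;
}

/* Release a mutex-style lock previously taken with model_mtx_lock(). */
void model_mtx_unlock(struct model_thread *td)
{
    assert(td != NULL && td->nonsleepable_held > 0);
    td->nonsleepable_held--;
}

/* Block on a contended sx-style lock: the thread goes to sleep. */
void model_sx_sleep(struct model_thread *td)
{
    assert(td != NULL);
    td->sleeping = true;
}

/* The sx-style lock became available: the thread runs again. */
void model_wakeup(struct model_thread *td)
{
    assert(td != NULL);
    td->sleeping = false;
}

/*
 * Called on behalf of a waiter that wants to lend its priority to the
 * owner of a mutex-style lock. In this model, the stock behaviour
 * corresponds to panicking on MODEL_SLEEPING_OWNER; the quoted patch
 * corresponds to silently skipping the priority propagation instead.
 */
enum model_result model_propagate_priority(const struct model_thread *owner)
{
    assert(owner != NULL);
    if (owner->sleeping && owner->nonsleepable_held > 0)
        return MODEL_SLEEPING_OWNER;
    return MODEL_OK;
}
```

In this model, a thread that calls model_mtx_lock() and then
model_sx_sleep() makes model_propagate_priority() report
MODEL_SLEEPING_OWNER until it is woken up again. If that reading is
right, the patch hides the symptom but leaves the sleeping lock owner in
place, so any waiter on that lock stays blocked for as long as the owner
sleeps.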
Regards,
Thomas