panic: _mtx_lock_sleep: recursed on non-recursive mutex process
lock @ /usr/src/sys/kern/kern_exit.c:912
Kostik Belousov
kostikbel at gmail.com
Sun Apr 10 15:18:54 UTC 2011
On Sun, Apr 10, 2011 at 03:37:59PM +0200, Fabian Keil wrote:
> The following panic seems to be reliably reproducible with sources
> from yesterday (and today) by running claws-mail in X, switching to
> the console and trying to attach gdb to it with 'gdb -p $(pgrep claws-mail)'.
>
> gdb somehow fails to attach, and after leaving gdb the panic occurs.
>
> fk at r500 /usr/crash $cat info.0
> Dump header from device /dev/ada0s1b
> Architecture: amd64
> Architecture Version: 2
> Dump Length: 347676672B (331 MB)
> Blocksize: 512
> Dumptime: Sat Apr 9 22:48:15 2011
> Hostname: r500.local
> Magic: FreeBSD Kernel Dump
> Version String: FreeBSD 9.0-CURRENT #284 r+1c500d9: Sat Apr 9 15:44:51 CEST 2011
> fk at r500.local:/usr/obj/usr/src/sys/ZOEY
> Panic String: _mtx_lock_sleep: recursed on non-recursive mutex process lock @ /usr/src/sys/kern/kern_exit.c:912
>
> Dump Parity: 1557143341
> Bounds: 0
> Dump Status: good
>
> fk at r500 /usr/crash $kgdb /boot/kernel/kernel vmcore.0
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
> panic: _mtx_lock_sleep: recursed on non-recursive mutex process lock @ /usr/src/sys/kern/kern_exit.c:912
>
> cpuid = 0
> KDB: enter: panic
> panic: from debugger
> cpuid = 0
> Uptime: 3m48s
> Physical memory: 1974 MB
> Dumping 331 MB: 316 300 284 268 252 236 220 204 188 172 156 140 124 108 92 76 60 44 28 12
>
> Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done.
> done.
> [...]
> Loaded symbols for /boot/kernel/drm.ko
> #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:250
> 250 if (textdump_pending)
> (kgdb) where
> #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:250
> #1 0xffffffff80531fb7 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
> #2 0xffffffff80531a51 in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:591
> #3 0xffffffff803202b7 in db_panic (addr=Variable "addr" is not available.
> ) at /usr/src/sys/ddb/db_command.c:478
> #4 0xffffffff80320761 in db_command (last_cmdp=0xffffffff80c9fb60, cmd_table=Variable "cmd_table" is not available.
> ) at /usr/src/sys/ddb/db_command.c:445
> #5 0xffffffff803209b0 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
> #6 0xffffffff80322a99 in db_trap (type=Variable "type" is not available.
> ) at /usr/src/sys/ddb/db_main.c:229
> #7 0xffffffff80567ba1 in kdb_trap (type=3, code=0, tf=0xffffff8000017740) at /usr/src/sys/kern/subr_kdb.c:533
> #8 0xffffffff807cee06 in trap (frame=0xffffff8000017740) at /usr/src/sys/amd64/amd64/trap.c:590
> #9 0xffffffff807b9303 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:228
> #10 0xffffffff805679fb in kdb_enter (why=0xffffffff80955faf "panic", msg=0xa <Address 0xa out of bounds>) at cpufunc.h:63
> #11 0xffffffff80531a60 in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:574
> #12 0xffffffff805222f9 in _mtx_lock_sleep (m=Variable "m" is not available.
> ) at /usr/src/sys/kern/kern_mutex.c:341
> #13 0xffffffff805223a5 in _mtx_lock_flags (m=Variable "m" is not available.
> ) at /usr/src/sys/kern/kern_mutex.c:203
> #14 0xffffffff80503789 in proc_reparent (child=0xfffffe001290a000, parent=0xfffffe0012e48900) at /usr/src/sys/kern/kern_exit.c:912
> #15 0xffffffff80503ed8 in kern_wait (td=0xffffff8000017adc, pid=Variable "pid" is not available.
> ) at /usr/src/sys/kern/kern_exit.c:708
> #16 0xffffffff805043e5 in wait4 (td=Variable "td" is not available.
> ) at /usr/src/sys/kern/kern_exit.c:653
> #17 0xffffffff805751ab in syscallenter (td=0xfffffe00028838c0, sa=0xffffff8000017bb0) at /usr/src/sys/kern/subr_trap.c:344
> #18 0xffffffff807ce8bc in syscall (frame=0xffffff8000017c50) at /usr/src/sys/amd64/amd64/trap.c:910
> #19 0xffffffff807b95cd in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:384
> #20 0x000000000040cb8c in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb) f 14
> #14 0xffffffff80503789 in proc_reparent (child=0xfffffe001290a000, parent=0xfffffe0012e48900) at /usr/src/sys/kern/kern_exit.c:912
> 912 PROC_LOCK(parent);
> (kgdb) p *child
> $2 = {p_list = {le_next = 0x0, le_prev = 0xffffffff80cba328}, p_threads = {tqh_first = 0xfffffe001287c000, tqh_last = 0xfffffe001287c010}, p_slock = {lock_object = {
> lo_name = 0xffffffff80954527 "process slock", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, p_ucred = 0xfffffe00427b5a00, p_fd = 0x0,
> p_fdtol = 0x0, p_stats = 0xfffffe0005d82800, p_limit = 0x0, p_limco = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0,
> c_func = 0, c_lock = 0xfffffe001290a0f8, c_flags = 0, c_cpu = 0}, p_sigacts = 0xfffffe0042634000, p_flag = 268460034, p_state = PRS_ZOMBIE, p_pid = 3515, p_hash = {
> le_next = 0x0, le_prev = 0xffffff8000222dd8}, p_pglist = {le_next = 0x0, le_prev = 0xfffffe004a876790}, p_pptr = 0xfffffe0002881900, p_sibling = {
> le_next = 0xfffffe0042640900, le_prev = 0xfffffe00028819f0}, p_children = {lh_first = 0x0}, p_mtx = {lock_object = {lo_name = 0xffffffff8095451a "process lock",
> lo_flags = 21168128, lo_data = 0, lo_witness = 0x0}, mtx_lock = 18446741874728777920}, p_ksi = 0xfffffe0005c60150, p_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}},
> sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {tqh_first = 0x0, tqh_last = 0xfffffe001290a140}, sq_proc = 0xfffffe001290a000, sq_flags = 1}, p_oppid = 0,
> p_vmspace = 0xffffffff80cb7760, p_swtick = 21105, p_realtimer = {it_interval = {tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}}, p_ru = {ru_utime = {
> tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 21044, ru_ixrss = 544640, ru_idrss = 334244, ru_isrss = 27264, ru_minflt = 2745,
> ru_majflt = 2158, ru_nswap = 0, ru_inblock = 3, ru_oublock = 0, ru_msgsnd = 134, ru_msgrcv = 103, ru_nsignals = 0, ru_nvcsw = 473, ru_nivcsw = 533}, p_rux = {
> rux_runtime = 2346291660, rux_uticks = 119, rux_sticks = 29, rux_iticks = 0, rux_uu = 948699, rux_su = 223223, rux_tu = 1171923}, p_crux = {rux_runtime = 0,
> rux_uticks = 0, rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 0, rux_tu = 0}, p_profthreads = 0, p_exitthreads = 0, p_traceflag = 0, p_tracevp = 0x0,
> p_tracecred = 0x0, p_textvp = 0x0, p_lock = 0, p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_sig = 0, p_code = 0, p_stops = 0, p_stype = 0, p_step = 0 '\0',
> p_pfsflags = 0 '\0', p_nlminfo = 0x0, p_aioinfo = 0x0, p_singlethread = 0x0, p_suspcount = 0, p_xthread = 0xfffffe001287c000, p_boundary_count = 0, p_pendingcnt = 0,
> p_itimers = 0x0, p_magic = 3203398350, p_osrel = 900034, p_comm = "claws-mail\000\000\000\000\000\000\000\000\000", p_pgrp = 0xfffffe004a876780,
> p_sysent = 0xffffffff80c6b320, p_args = 0xfffffe0005c8b3c0, p_cpulimit = 9223372036854775807, p_nice = 0 '\0', p_fibnum = 0, p_xstat = 9, p_klist = {kl_list = {
> slh_first = 0x0}, kl_lock = 0, kl_unlock = 0, kl_assert_locked = 0xffffffff804fd960 <knlist_mtx_assert_locked>,
> kl_assert_unlocked = 0xffffffff804fd940 <knlist_mtx_assert_unlocked>, kl_lockarg = 0x0}, p_numthreads = 1, p_md = {md_ldt = 0x0, md_ldt_sd = {sd_lolimit = 0,
> sd_lobase = 0, sd_type = 0, sd_dpl = 0, sd_p = 0, sd_hilimit = 0, sd_xx0 = 0, sd_gran = 0, sd_hibase = 0, sd_xx1 = 0, sd_mbz = 0, sd_xx2 = 0}}, p_itcallout = {
> c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_lock = 0x0, c_flags = 16, c_cpu = 0}, p_acflag = 16,
> p_peers = 0x0, p_leader = 0xfffffe001290a000, p_emuldata = 0x0, p_label = 0x0, p_sched = 0xfffffe001290a480, p_ktr = {stqh_first = 0x0, stqh_last = 0xfffffe001290a430},
> p_mqnotifier = {lh_first = 0x0}, p_dtrace = 0xfffffe0005c7a740, p_pwait = {cv_description = 0xffffffff80954fbe "ppwait", cv_waiters = 0}, p_dbgwait = {
> cv_description = 0xffffffff80954fc5 "dbgwait", cv_waiters = 0}, p_prev_runtime = 0, p_racct = 0x0}
> (kgdb) p *parent
> $3 = {p_list = {le_next = 0xfffffe004261b000, le_prev = 0xffffffff80cba320}, p_threads = {tqh_first = 0xfffffe0005e0d8c0, tqh_last = 0xfffffe0005e0d8d0}, p_slock = {
> lock_object = {lo_name = 0xffffffff80954527 "process slock", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, p_ucred = 0xfffffe00427b5a00,
> p_fd = 0xfffffe00128b1c00, p_fdtol = 0x0, p_stats = 0xfffffe001290c800, p_limit = 0xfffffe00427b5300, p_limco = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0,
> tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_lock = 0xfffffe0012e489f8, c_flags = 0, c_cpu = 0}, p_sigacts = 0xfffffe0012e9c000, p_flag = 268451842,
> p_state = PRS_NORMAL, p_pid = 3407, p_hash = {le_next = 0x0, le_prev = 0xffffff8000222a78}, p_pglist = {le_next = 0x0, le_prev = 0xfffffe0005e59310},
> p_pptr = 0xfffffe004261b000, p_sibling = {le_next = 0x0, le_prev = 0xfffffe004261b0f0}, p_children = {lh_first = 0x0}, p_mtx = {lock_object = {
> lo_name = 0xffffffff8095451a "process lock", lo_flags = 21168128, lo_data = 0, lo_witness = 0x0}, mtx_lock = 18446741874728777920}, p_ksi = 0xfffffe0005c5eb60,
> p_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {tqh_first = 0x0, tqh_last = 0xfffffe0012e48a40},
> sq_proc = 0xfffffe0012e48900, sq_flags = 1}, p_oppid = 0, p_vmspace = 0xfffffe00128c5310, p_swtick = 12384, p_realtimer = {it_interval = {tv_sec = 0, tv_usec = 0},
> it_value = {tv_sec = 0, tv_usec = 0}}, p_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, ru_ixrss = 0, ru_idrss = 0,
> ru_isrss = 0, ru_minflt = 0, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 0, ru_nivcsw = 0},
> p_rux = {rux_runtime = 1633285860, rux_uticks = 63, rux_sticks = 44, rux_iticks = 0, rux_uu = 482021, rux_su = 336649, rux_tu = 818671}, p_crux = {rux_runtime = 770335510,
> rux_uticks = 11, rux_sticks = 39, rux_iticks = 0, rux_uu = 84947, rux_su = 301176, rux_tu = 386124}, p_profthreads = 0, p_exitthreads = 0, p_traceflag = 0,
> p_tracevp = 0x0, p_tracecred = 0x0, p_textvp = 0xfffffe00427ae780, p_lock = 0, p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_sig = 0, p_code = 0, p_stops = 0,
> p_stype = 0, p_step = 0 '\0', p_pfsflags = 0 '\0', p_nlminfo = 0x0, p_aioinfo = 0x0, p_singlethread = 0x0, p_suspcount = 0, p_xthread = 0x0, p_boundary_count = 0,
> p_pendingcnt = 0, p_itimers = 0x0, p_magic = 3203398350, p_osrel = 900025, p_comm = "bash", '\0' <repeats 15 times>, p_pgrp = 0xfffffe0005e59300,
> p_sysent = 0xffffffff80c6b320, p_args = 0xfffffe0002ba2880, p_cpulimit = 9223372036854775807, p_nice = 0 '\0', p_fibnum = 0, p_xstat = 0, p_klist = {kl_list = {
> slh_first = 0x0}, kl_lock = 0xffffffff804fd660 <knlist_mtx_lock>, kl_unlock = 0xffffffff804fd680 <knlist_mtx_unlock>,
> kl_assert_locked = 0xffffffff804fd960 <knlist_mtx_assert_locked>, kl_assert_unlocked = 0xffffffff804fd940 <knlist_mtx_assert_unlocked>, kl_lockarg = 0xfffffe0012e489f8},
> p_numthreads = 1, p_md = {md_ldt = 0x0, md_ldt_sd = {sd_lolimit = 0, sd_lobase = 0, sd_type = 0, sd_dpl = 0, sd_p = 0, sd_hilimit = 0, sd_xx0 = 0, sd_gran = 0,
> sd_hibase = 0, sd_xx1 = 0, sd_mbz = 0, sd_xx2 = 0}}, p_itcallout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0,
> c_arg = 0x0, c_func = 0, c_lock = 0x0, c_flags = 16, c_cpu = 0}, p_acflag = 0, p_peers = 0x0, p_leader = 0xfffffe0012e48900, p_emuldata = 0x0, p_label = 0x0,
> p_sched = 0xfffffe0012e48d80, p_ktr = {stqh_first = 0x0, stqh_last = 0xfffffe0012e48d30}, p_mqnotifier = {lh_first = 0x0}, p_dtrace = 0xfffffe0005d85b80, p_pwait = {
> cv_description = 0xffffffff80954fbe "ppwait", cv_waiters = 0}, p_dbgwait = {cv_description = 0xffffffff80954fc5 "dbgwait", cv_waiters = 0}, p_prev_runtime = 0,
> p_racct = 0x0}
>
> I'm not sure if switching to the console first matters.
>
> Before the panic, the states of gdb and claws-mail were:
>
> fk at r500 ~ $procstat -kk $(pgrep claws-mail)
> PID TID COMM TDNAME KSTACK
> 15813 100528 claws-mail initial thread mi_switch+0x21d thread_suspend_switch+0x103 ptracestop+0x7a cursig+0x4af ast+0x198 doreti_ast+0x1f
> fk at r500 ~ $procstat -kk $(pgrep gdb)
> PID TID COMM TDNAME KSTACK
> 15823 100928 gdb - mi_switch+0x21d sleepq_switch+0x123 sleepq_catch_signals+0x2a2 sleepq_wait_sig+0x16 _cv_wait_sig+0x14c tty_wait+0x48 ttydisc_read+0xd4 ttydev_read+0xab devfs_read_f+0x88 dofileread+0xa1 kern_readv+0x60 read+0x55 syscallenter+0x1cb syscall+0x4c Xfast_syscall+0xdd
>
> I believe I also reproduced the problem with a kernel from Tuesday,
> but dumping core from ddb failed so I can't check.
>
> I cannot reproduce the panic with a kernel from 2011-03-12
> which I haven't deleted yet, but didn't try to bisect it further.
>
> gdb can't attach to claws-mail either, but the panic doesn't occur:
>
> fk at r500 ~ $gdb -p $(pgrep claws-mail)
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd".
> Attaching to process 3432
> /usr/src/gnu/usr.bin/gdb/libgdb/../../../../contrib/gdb/gdb/solib-svr4.c:1444: i
> nternal-error: legacy_fetch_link_map_offsets called without legacy link_map supp
> ort enabled.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n) y
>
> /usr/src/gnu/usr.bin/gdb/libgdb/../../../../contrib/gdb/gdb/solib-svr4.c:1444: i
> nternal-error: legacy_fetch_link_map_offsets called without legacy link_map supp
> ort enabled.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Create a core file of GDB? (y or n) n
>
> In related news, the kernel from yesterday also seems to panic/hang/whatever
> when claws-mail segfaults due to an invalid memory access, but as I was running
> X, I got no core dumps for those problems either.
For the first problem, try patch at the end.
Regarding the 'panic/hang/whatever', I cannot reproduce it with a program
that specifically access unmapped memory. You would need to catch core
dump and provide the backtrace.
diff --git a/sys/kern/kern_exit.c b/sys/kern/kern_exit.c
index 5fec31b..2d35ed5 100644
--- a/sys/kern/kern_exit.c
+++ b/sys/kern/kern_exit.c
@@ -903,15 +903,19 @@ loop:
void
proc_reparent(struct proc *child, struct proc *parent)
{
+ int locked;
sx_assert(&proctree_lock, SX_XLOCKED);
PROC_LOCK_ASSERT(child, MA_OWNED);
if (child->p_pptr == parent)
return;
- PROC_LOCK(parent);
+ locked = PROC_LOCKED(parent);
+ if (!locked)
+ PROC_LOCK(parent);
racct_add_force(parent, RACCT_NPROC, 1);
- PROC_UNLOCK(parent);
+ if (!locked)
+ PROC_UNLOCK(parent);
PROC_LOCK(child->p_pptr);
racct_sub(child->p_pptr, RACCT_NPROC, 1);
sigqueue_take(child->p_ksi);
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 196 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-current/attachments/20110410/693452d7/attachment.pgp
More information about the freebsd-current
mailing list