[Bug 202801] panic: spin lock held too long, maybe a pipe issue
bugzilla-noreply at freebsd.org
bugzilla-noreply at freebsd.org
Mon Aug 31 22:28:25 UTC 2015
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=202801
Bug ID: 202801
Summary: panic: spin lock held too long, maybe a pipe issue
Product: Base System
Version: 8.4-STABLE
Hardware: amd64
OS: Any
Status: New
Severity: Affects Only Me
Priority: ---
Component: kern
Assignee: freebsd-bugs at FreeBSD.org
Reporter: longwitz at incore.de
On a system running FreeBSD 8.4-STABLE r284383 (amd64) the system panics during
amanda backup.
The processes amandad (pid 11806,tid 101757, holds spin lock) and sendbackup
(pid 11809, tid 100537, panics) are involved.
(kgdb) ps
pid ppid pgrp uid state wmesg wchan cmd
12286 2867 30 0 S nanslp 0xffffffff808ab788 sleep
11818 11815 11806 990 S pause 0xffffff00bbcb0980 dump
11817 11815 11806 990 S pause 0xffffff00150ab0a0 dump
11816 11815 11806 990 S pipdwt 0xffffff03135e02d8 dump
11815 11810 11806 990 S sbwait 0xffffff035bc233ec dump
11810 11807 11806 990 S wait 0xffffff01087e08e0 dump
11809 11807 11806 990 R CPU 2 sendbackup
11807 11806 11806 990 S piperd 0xffffff0015100000 initial thread
11806 5831 11806 990 Ss pipelk 0xffffff03b40972d8 initial thread
7761 7751 7617 984 S select 0xffffff03d63985c0 initial thread
7756 7737 7607 984 S select 0xffffff02b1cee140 initial thread
..........
100000 68 Run CPU 5 [swapper]
11811 11809 11806 990 Z sh
11808 11806 11806 990 Z amandad
(kgdb) info threads
507 Thread 101855 (PID=12286: sleep) sched_switch (td=0xffffff00b8d4c8e0,
newtd=0xffffff0003567000, flags=<value optimized out>)
at /usr/src/sys/kern/sched_ule.c:1932
506 Thread 101216 (PID=11818: dump) sched_switch (td=0xffffff00bbceb000,
newtd=0xffffff000355c8e0, flags=<value optimized out>)
at /usr/src/sys/kern/sched_ule.c:1932
505 Thread 100132 (PID=11817: dump) sched_switch (td=0xffffff00150c5000,
newtd=0xffffff000355c470, flags=<value optimized out>)
at /usr/src/sys/kern/sched_ule.c:1932
504 Thread 101906 (PID=11816: dump) sched_switch (td=0xffffff033b5fc470,
newtd=0xffffff00035678e0, flags=<value optimized out>)
at /usr/src/sys/kern/sched_ule.c:1932
503 Thread 100957 (PID=11815: dump) sched_switch (td=0xffffff03a1c7f8e0,
newtd=0xffffff000355c470, flags=<value optimized out>)
at /usr/src/sys/kern/sched_ule.c:1932
502 Thread 101391 (PID=11810: dump) sched_switch (td=0xffffff02a0f86000,
newtd=0xffffff000355b000, flags=<value optimized out>)
at /usr/src/sys/kern/sched_ule.c:1932
* 501 Thread 100537 (PID=11809: sendbackup) doadump () at
/usr/src/sys/kern/kern_shutdown.c:266
500 Thread 101807 (PID=11807: sendbackup/initial thread) sched_switch
(td=0xffffff005d1d4470, newtd=0xffffff000355b470, flags=<value optimized out>)
at /usr/src/sys/kern/sched_ule.c:1932
499 Thread 101757 (PID=11806: amandad/initial thread) sched_switch
(td=0xffffff024153b8e0, newtd=0xffffff0003567470, flags=<value optimized out>)
at /usr/src/sys/kern/sched_ule.c:1932
Unread portion of the kernel message buffer:
spin lock 0xffffffff808cccd8 (sleepq chain) held by 0xffffff024153b8e0 (tid
101757) too long
panic: spin lock held too long
cpuid = 2
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x1ce
_mtx_lock_spin_failed() at _mtx_lock_spin_failed+0x39
_mtx_lock_spin() at _mtx_lock_spin+0xc3
wakeup() at wakeup+0x19
pipe_write() at pipe_write+0x9a9
dofilewrite() at dofilewrite+0x8b
kern_writev() at kern_writev+0x60
write() at write+0x55
amd64_syscall() at amd64_syscall+0x24f
Xfast_syscall() at Xfast_syscall+0xfc
--- syscall (4, FreeBSD ELF64, write), rip = 0x801e7287c, rsp = 0x7fffffffe448,
rbp = 0x7fffffffe470 ---
KDB: enter: panic
0xffffff0015b74b10: tag ufs, type VREG
usecount 1, writecount 0, refcount 3 mountedhere 0
flags ()
v_object 0xffffff00ba0d1510 ref 0 pages 1
lock type ufs: EXCL by thread 0xffffff0003cca8e0 (pid 8)
ino 329813, on dev mirror/gmsv02p8.journal
Dumping 9635 out of 16365 MB:..1%..11%..21%..31%..41%.
#0 doadump () at /usr/src/sys/kern/kern_shutdown.c:266
#1 0xffffffff801fdfcc in db_fncall (dummy1=<value optimized out>,
dummy2=<value optimized out>, dummy3=<value optimized out>, dummy4=<value
optimized
out
>)
at /usr/src/sys/ddb/db_command.c:548
#2 0xffffffff801fe27d in db_command (last_cmdp=0xffffffff80879dc0,
cmd_table=<value optimized out>, dopager=0) at
/usr/src/sys/ddb/db_command.c:445
#3 0xffffffff80202933 in db_script_exec (scriptname=0xffffff84862da410
"kdb.enter.panic", warnifnotfound=0) at /usr/src/sys/ddb/db_script.c:302
#4 0xffffffff80202a02 in db_script_kdbenter (eventname=<value optimized out>)
at /usr/src/sys/ddb/db_script.c:324
#5 0xffffffff80200614 in db_trap (type=<value optimized out>, code=<value
optimized out>) at /usr/src/sys/ddb/db_main.c:230
#6 0xffffffff80440f21 in kdb_trap (type=3, code=0, tf=0xffffff84862da6f0) at
/usr/src/sys/kern/subr_kdb.c:654
#7 0xffffffff805f5ba6 in trap (frame=0xffffff84862da6f0) at
/usr/src/sys/amd64/amd64/trap.c:574
#8 0xffffffff805db808 in calltrap () at
/usr/src/sys/amd64/amd64/exception.S:232
#9 0xffffffff804409bb in kdb_enter (why=0xffffffff80695a0a "panic", msg=0xa
<Address 0xa out of bounds>) at cpufunc.h:63
#10 0xffffffff8040aa67 in panic (fmt=<value optimized out>) at
/usr/src/sys/kern/kern_shutdown.c:616
#11 0xffffffff803f7f09 in _mtx_lock_spin_failed (m=<value optimized out>) at
/usr/src/sys/kern/kern_mutex.c:505
#12 0xffffffff803f7fd3 in _mtx_lock_spin (m=0xffffffff808cccd8,
tid=18446742977329681520, opts=<value optimized out>, file=<value optimized
out>,
line=<value optimized out>) at /usr/src/sys/kern/kern_mutex.c:544
#13 0xffffffff80414719 in wakeup (ident=0xffffff03b40972d8) at
/usr/src/sys/kern/kern_synch.c:365
#14 0xffffffff80458029 in pipe_write (fp=0xffffff00bfa614b0,
uio=0xffffff84862daad0, active_cred=<value optimized out>, flags=<value
optimized out>,
td=<value optimized out>) at /usr/src/sys/kern/sys_pipe.c:517
#15 0xffffffff80453cbb in dofilewrite (td=0xffffff00baaad470, fd=3,
fp=0xffffff00bfa614b0, auio=0xffffff84862daad0, offset=<value optimized out>,
flags
=0)
at file.h:254
#16 0xffffffff80453fd0 in kern_writev (td=0xffffff00baaad470, fd=3,
auio=0xffffff84862daad0) at /usr/src/sys/kern/sys_generic.c:447
#17 0xffffffff80454055 in write (td=<value optimized out>, uap=<value optimized
out>) at /usr/src/sys/kern/sys_generic.c:363
#18 0xffffffff805f483f in amd64_syscall (td=0xffffff00baaad470, traced=0) at
subr_syscall.c:114
#19 0xffffffff805dbafc in Xfast_syscall () at
/usr/src/sys/amd64/amd64/exception.S:392
#20 0x0000000801e7287c in ?? ()
kgdb) f 12
#12 0xffffffff803f7fd3 in _mtx_lock_spin (m=0xffffffff808cccd8,
tid=18446742977329681520, opts=<value optimized out>, file=<value optimized
out>,
line=<value optimized out>) at /usr/src/sys/kern/kern_mutex.c:544
544 _mtx_lock_spin_failed(m);
(kgdb) info loc
i = 60000000 <---- indicates too long spin
(kgdb) p m
$8 = (struct mtx *) 0xffffffff808cccd8
(kgdb) mtx_owner m
td: 0xffffff024153b8e0
pid: 11806, p_comm: amandad
(kgdb) p *m
$12 = {lock_object = {lo_name = 0xffffffff8069a4d1 "sleepq chain", lo_flags =
720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 18446742983883864288}
(kgdb) x/x m->mtx_lock
0xffffff024153b8e0: 0x808cc3c8
kgdb) tid 101757
[Switching to thread 499 (Thread 101757)]#0 sched_switch
(td=0xffffff024153b8e0, newtd=0xffffff0003567470, flags=<value optimized out>)
(kgdb) where
#0 sched_switch (td=0xffffff024153b8e0, newtd=0xffffff0003567470, flags=<value
optimized out>) at /usr/src/sys/kern/sched_ule.c:1932
#1 0xffffffff804142a6 in mi_switch (flags=260, newtd=0x0) at
/usr/src/sys/kern/kern_synch.c:466
#2 0xffffffff8044be92 in sleepq_catch_signals (wchan=0xffffff024153b8e0,
pri=76) at /usr/src/sys/kern/subr_sleepqueue.c:421
#3 0xffffffff8044c9d2 in sleepq_wait_sig (wchan=<value optimized out>,
pri=<value optimized out>) at /usr/src/sys/kern/subr_sleepqueue.c:627
#4 0xffffffff80414a0d in _sleep (ident=0xffffff024153b8e0, lock=0x100,
priority=<value optimized out>, wmesg=0xffffffff8069ac67 "pipelk", timo=0)
at /usr/src/sys/kern/kern_synch.c:248
#5 0xffffffff8045704d in pipe_read (fp=0xffffff00d85804b0,
uio=0xffffff8487aaead0, active_cred=<value optimized out>, flags=<value
optimized out>,
td=<value optimized out>) at /usr/src/sys/kern/sys_pipe.c:493
#6 0xffffffff80454101 in dofileread (td=0xffffff024153b8e0, fd=7,
fp=0xffffff00d85804b0, auio=0xffffff8487aaead0, offset=<value optimized out>,
flags=
0)
at file.h:242
#7 0xffffffff80454490 in kern_readv (td=0xffffff024153b8e0, fd=7,
auio=0xffffff8487aaead0) at /usr/src/sys/kern/sys_generic.c:238
#8 0xffffffff80454585 in read (td=<value optimized out>, uap=<value optimized
out>) at /usr/src/sys/kern/sys_generic.c:154
#9 0xffffffff805f483f in amd64_syscall (td=0xffffff024153b8e0, traced=0) at
subr_syscall.c:114
#10 0xffffffff805dbafc in Xfast_syscall () at
/usr/src/sys/amd64/amd64/exception.S:392
#11 0x0000000801d6289c in ?? ()
kgdb) f 4
#4 0xffffffff80414a0d in _sleep (ident=0xffffff024153b8e0, lock=0x100,
priority=<value optimized out>, wmesg=0xffffffff8069ac67 "pipelk", timo=0)
at /usr/src/sys/kern/kern_synch.c:248
248 rval = sleepq_wait_sig(ident, pri);
(kgdb) list
243 if (timo && catch)
244 rval = sleepq_timedwait_sig(ident, pri);
245 else if (timo)
246 rval = sleepq_timedwait(ident, pri);
247 else if (catch)
248 rval = sleepq_wait_sig(ident, pri);
249 else {
250 sleepq_wait(ident, pri);
251 rval = 0;
252 }
(kgdb) p catch
$26 = 256
(kgdb) p timo
$27 = 0
kgdb) f 5
#5 0xffffffff8045704d in pipe_read (fp=0xffffff00d85804b0,
uio=0xffffff8487aaead0, active_cred=<value optimized out>, flags=<value
optimized out>,
td=<value optimized out>) at /usr/src/sys/kern/sys_pipe.c:493
493 error = msleep(cpipe, PIPE_MTX(cpipe),
(kgdb) list
488 int error;
489
490 PIPE_LOCK_ASSERT(cpipe, MA_OWNED);
491 while (cpipe->pipe_state & PIPE_LOCKFL) {
492 cpipe->pipe_state |= PIPE_LWANT;
493 error = msleep(cpipe, PIPE_MTX(cpipe),
494 catch ? (PRIBIO | PCATCH) : PRIBIO,
495 "pipelk", 0);
496 if (error != 0)
497 return (error);
(kgdb) p *(struct pipe *)0xffffff024153b8e0
$28 = {pipe_buffer = {cnt = 2156708808, in = 4294967295, out = 1147716720, size
= 4294967042, buffer = 0x0}, pipe_map = {cnt = 18446742983935575168, po
s = 0, npages = -2138352248, ms = {0x0, 0xffffff0149d19e00, 0x0,
0xffffff84877254f0, 0xffffff000354fdc8, 0xffffff00151f0280, 0x0,
0xffffff00150586c0, 0
xffffff02cb71bd00, 0x18d7d, 0x0, 0x0, 0x0, 0x0, 0x0, 0xffffff024153b980,
0xffffff024468c470}}, pipe_sel = {si_tdlist = {tqh_first = 0x1, tqh_last = 0x2
0000000c}, si_note = {kl_list = {slh_first = 0x0}, kl_lock = 0, kl_unlock =
0xffffff03b40972d8, kl_assert_locked = 0xffffffff8069ac67 <select_flags+23>
, kl_assert_unlocked = 0x5f0000ff04, kl_lockarg = 0x0}, si_mtx = 0x0},
pipe_atime = {tv_sec = 0, tv_nsec = 0}, pipe_mtime = {tv_sec = 0, tv_nsec = 0},
pipe_ctime = {tv_sec = -1096019876864, tv_nsec = 8189724157764698112},
pipe_sigio = 0xd400000000 , pipe_peer = 0x0, pipe_pair = 0x0, pipe_state = 0,
pi
pe_busy = 0, pipe_present = 0}
(kgdb) p *(struct lock_object*)0x100
Cannot access memory at address 0x100
That looks suspicious, further line 493 in sys_pipe.c uses PIPE_MTX:
#define PIPE_MTX(pipe) (&(pipe)->pipe_pair->pp_mtx)
but we saw pipe_pair=0.
I can give more information from the kernel dump.
--
You are receiving this mail because:
You are the assignee for the bug.
More information about the freebsd-bugs
mailing list