spin lock sched lock held for > 5 seconds

Fri Aug 8 11:29:42 PDT 2003

On 08-Aug-2003 Lars Eggert wrote:
> John Baldwin wrote:
>> On 01-Aug-2003 Lars Eggert wrote:
>> 
>>>Hi,
>>>
>>>got the following panic overnight running with all debugging options on
>>>(WITNESS, MUTEX_DEBUG, DIAGNOSTIC, INVARIANTS; WITNESS_SKIPSPIN off):
>>>
>>>panic: spin lock sched lock held by 0xc658e130 for > 5 seconds
>>>cpuid = 0; lapic.id = 00000000
>>>Stack backtrace:
>>>backtrace(c031d030,0,c031c4c5,df0dab8c,100) at backtrace+0x17
>>>panic(c031c4c5,c031c62e,c658e130,c036f160,18b) at panic+0x13d
>>>_mtx_lock_spin(c036f160,2,c031a229,18b,c21b2ab0) at _mtx_lock_spin+0x83
>>>_mtx_lock_spin_flags(c036f160,2,c031a229,18b,df0dac0c) at
>>>_mtx_lock_spin_flags+0xb9
>>>statclock(df0dac00,df0dac44,c02d8a9c,0,c2198d00) at statclock+0x39
>>>rtcintr(0) at rtcintr+0x4f
>>>Xfastunpend8(df0dacb8,c02d1f05,8,608,c0372e60) at Xfastunpend8+0x1c
>>>call_fast_unpend(8,608,c0372e60,ffc00034,0) at call_fast_unpend+0xd
>>>i386_unpend(c036f160,c21b0790,df0dacd0,c01b1ae0,df0dacec) at
>>>i386_unpend+0x8d
>>>cpu_unpend(df0dacec,c01a2534,c036f160,1,c031c2e2) at cpu_unpend+0x2d
>>>critical_exit(c036f160,1,c031c2e2,1bc,1) at critical_exit+0x2d
>>>_mtx_unlock_spin_flags(c036f160,0,c031ac9f,7c,c21b2ab0) at
>>>_mtx_unlock_spin_flags+0xbb
>>>idle_proc(0,df0dad48,c031ab89,312,c21b2ab0) at idle_proc+0xb0
>>>fork_exit(c0199972,0,df0dad48) at fork_exit+0xc3
>>>fork_trampoline() at fork_trampoline+0x1a
>>>--- trap 0x1, eip = 0, esp = 0xdf0dad7c, ebp = 0 ---
>>>Debugger("panic")
>>>timeout stopping cpus
>>>Stopped at      Debugger+0x4f:  xchgl   %ebx,in_Debugger.0
>>>db>
>>>
>>>The machine is still in ddb, let me know if I can provide additional info.
>> 
>> Try updating to a more recent current.  I recently added some extra
>> debugging here that will attempt to better show who owns the lock and
>> where it was acquired.
> 
> Same panic string, but a different call chain:
> 
> spin lock sched lock held by 0xc21b2d10 for > 5 seconds
> exclusive spin mutex sched lock r = 0 (0xc036efc0) locked @ 
> /usr/src/sys/kern/kern_mutex.c:512
> panic: spin lock held too long
> cpuid = 3; lapic.id = 03000000
> Stack backtrace:
> backtrace(c031ce60,3000000,c031c305,df0d0c30,100) at backtrace+0x17
> panic(c031c305,c21b2d10,c21b2d10,c036efc0,bc) at panic+0x13d
> _mtx_lock_spin(c036efc0,2,c031a037,bc,c21b2720) at _mtx_lock_spin+0xb4
> _mtx_lock_spin_flags(c036efc0,2,c031a037,bc,c21b2720) at 
> _mtx_lock_spin_flags+0xb9
> hardclock_process(df0d0ca0,df0d0ce4,c02d7062,0,c0390018) at 
> hardclock_process+0x3c
> forwarded_hardclock(0) at forwarded_hardclock+0x11
> Xhardclock(df0d0d10,c01998d7,c036f000,2,c031aaad) at Xhardclock+0x52
> cpu_idle(c036f000,2,c031aaad,5f,c21b2720) at cpu_idle+0x8
> idle_proc(0,df0d0d48,c031a997,30e,0) at idle_proc+0x45
> fork_exit(c0199892,0,df0d0d48) at fork_exit+0xc3
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xdf0d0d7c, ebp = 0 ---
> Debugger("panic")
> timeout stopping cpus
> Stopped at      Debugger+0x4f:  xchgl   %ebx,in_Debugger.0
> db>
> 
> Machine is still in ddb, in case you'd like me to poke around some more.

Do 'show pcpu' for each CPU.  If this is a dual-CPU machine, it should be
fairly easy: use 'show pcpu' to figure out which CPU you are on, then do
a 'show pcpu x' for the other CPU, find its curthread, and do a
'trace XXX' on its PID to see where it is.  That might not work, but
it's worth a shot.  Also, you can try 'options ADAPTIVE_MUTEXES'
and see if that helps.
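
Roughly, at the db> prompt that sequence would look like the following.
<cpuid> and <pid> are placeholders to fill in from the 'show pcpu'
output, not values taken from this panic:

    db> show pcpu
    db> show pcpu <cpuid>
    db> trace <pid>

The mutex option goes in the kernel config file, followed by a kernel
rebuild:

    options 	ADAPTIVE_MUTEXES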

-- 

John Baldwin <jhb at FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/