panic: spin lock held too long (RELENG_8 from today)

Jeremy Chadwick freebsd at jdc.parodius.com
Thu Jul 7 11:41:24 UTC 2011


On Thu, Jul 07, 2011 at 07:32:41AM -0400, Mike Tancsa wrote:
> On 7/7/2011 4:20 AM, Kostik Belousov wrote:
> > 
> > BTW, we had a similar panic, "spinlock held too long", where the
> > spinlock was sched lock N, on a busy 8-core box recently upgraded to
> > stable/8. Unfortunately, the machine hung while dumping core, so the
> > stack trace for the owner thread was not available.
> > 
> > I was unable to make any conclusion from the data that was present.
> > If the situation is reproducible, you could try to revert r221937. This
> > is pure speculation, though.
> 
> Another crash just now after 5hrs uptime. I will try and revert r221937
> unless there is any extra debugging you want me to add to the kernel
> instead?
> 
> This is an inbound mail server, so a little disruption is possible.
> 
>  kgdb /usr/obj/usr/src/sys/recycle/kernel.debug vmcore.13
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd"...
> 
> Unread portion of the kernel message buffer:
> spin lock 0xc0b1d200 (sched lock 1) held by 0xc5dac2e0 (tid 100109) too long
> panic: spin lock held too long
> cpuid = 0
> Uptime: 5h37m43s
> Physical memory: 2035 MB
> Dumping 260 MB: 245 229 213 197 181 165 149 133 117 101 85 69 53 37 21 5
> 
> Reading symbols from /boot/kernel/amdsbwd.ko...Reading symbols from
> /boot/kernel/amdsbwd.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/amdsbwd.ko
> #0  doadump () at pcpu.h:231
> 231     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) bt
> #0  doadump () at pcpu.h:231
> #1  0xc06fd6d3 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:429
> #2  0xc06fd937 in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:602
> #3  0xc06ed95f in _mtx_lock_spin_failed (m=0x0) at
> /usr/src/sys/kern/kern_mutex.c:490
> #4  0xc06ed9e5 in _mtx_lock_spin (m=0xc0b1d200, tid=3312388992, opts=0,
> file=0x0, line=0)
>     at /usr/src/sys/kern/kern_mutex.c:526
> #5  0xc0720254 in sched_add (td=0xc61892e0, flags=0) at
> /usr/src/sys/kern/sched_ule.c:1119
> #6  0xc07203f9 in sched_wakeup (td=0xc61892e0) at
> /usr/src/sys/kern/sched_ule.c:1950
> #7  0xc07061f8 in setrunnable (td=0xc61892e0) at
> /usr/src/sys/kern/kern_synch.c:499
> #8  0xc07362af in sleepq_resume_thread (sq=0xc55311c0, td=0xc61892e0,
> pri=Variable "pri" is not available.
> )
>     at /usr/src/sys/kern/subr_sleepqueue.c:751
> #9  0xc0736e18 in sleepq_signal (wchan=0xc60386d0, flags=1, pri=0, queue=0)
>     at /usr/src/sys/kern/subr_sleepqueue.c:825
> #10 0xc06b6764 in cv_signal (cvp=0xc60386d0) at
> /usr/src/sys/kern/kern_condvar.c:422
> #11 0xc08eaa0d in xprt_assignthread (xprt=Variable "xprt" is not available.
> ) at /usr/src/sys/rpc/svc.c:342
> #12 0xc08ec502 in xprt_active (xprt=0xc5db8a00) at
> /usr/src/sys/rpc/svc.c:378
> #13 0xc08ee051 in svc_vc_soupcall (so=0xc618a19c, arg=0xc5db8a00,
> waitflag=1) at /usr/src/sys/rpc/svc_vc.c:747
> #14 0xc075bbb1 in sowakeup (so=0xc618a19c, sb=0xc618a1f0) at
> /usr/src/sys/kern/uipc_sockbuf.c:191
> #15 0xc08447bc in tcp_do_segment (m=0xc6567a00, th=0xc6785824,
> so=0xc618a19c, tp=0xc617e000, drop_hdrlen=52,
>     tlen=1448, iptos=0 '\0', ti_locked=2) at
> /usr/src/sys/netinet/tcp_input.c:1775
> #16 0xc0847930 in tcp_input (m=0xc6567a00, off0=20) at
> /usr/src/sys/netinet/tcp_input.c:1329
> #17 0xc07ddaf7 in ip_input (m=0xc6567a00) at
> /usr/src/sys/netinet/ip_input.c:787
> #18 0xc07b8859 in netisr_dispatch_src (proto=1, source=0, m=0xc6567a00)
> at /usr/src/sys/net/netisr.c:859
> #19 0xc07b8af0 in netisr_dispatch (proto=1, m=0xc6567a00) at
> /usr/src/sys/net/netisr.c:946
> #20 0xc07ae5e1 in ether_demux (ifp=0xc56ed800, m=0xc6567a00) at
> /usr/src/sys/net/if_ethersubr.c:894
> #21 0xc07aeb5f in ether_input (ifp=0xc56ed800, m=0xc6567a00) at
> /usr/src/sys/net/if_ethersubr.c:753
> #22 0xc09977b2 in nfe_int_task (arg=0xc56ff000, pending=1) at
> /usr/src/sys/dev/nfe/if_nfe.c:2187
> #23 0xc07387ca in taskqueue_run_locked (queue=0xc5702440) at
> /usr/src/sys/kern/subr_taskqueue.c:248
> #24 0xc073895c in taskqueue_thread_loop (arg=0xc56ff130) at
> /usr/src/sys/kern/subr_taskqueue.c:385
> #25 0xc06d1027 in fork_exit (callout=0xc07388a0 <taskqueue_thread_loop>,
> arg=0xc56ff130, frame=0xc538ed28)
>     at /usr/src/sys/kern/kern_fork.c:861
> #26 0xc09a5c24 in fork_trampoline () at
> /usr/src/sys/i386/i386/exception.s:275

To get a backtrace of the thread holding the lock, try this in kgdb:

1. info threads
2. Find the index value that matches the tid in question (in the above
   spin lock panic, that'd be tid 100109).  The index value will be
   the first number shown on the left.
3. thread {index}
4. bt

If this doesn't work, you can instead start over in kgdb, run
"thread apply all bt", and provide the output from that.  (It will be
quite lengthy; at this point I think tid 100109 is the thread of
interest in this crash, based on what Andriy said earlier.)
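
If the "info threads" listing is long, the lookup in step 2 can be done
with a small script over saved output.  This is only a rough sketch: the
line layout it expects ("<index> Thread <tid> (...)", optionally prefixed
with "*" for the current thread) is an assumption about how kgdb prints
threads, the helper name find_thread_index is made up, and the sample
listing is hypothetical, not taken from this crash.

```python
import re

# Assumed layout of one kgdb "info threads" line (an assumption, not
# copied from this vmcore):
#   "* 42 Thread 100109 (PID=...: ...)  some_function () at ..."
# Group 1 is the kgdb thread index, group 2 is the kernel tid.
THREAD_LINE = re.compile(r"^\s*\*?\s*(\d+)\s+Thread\s+(\d+)\b")


def find_thread_index(info_threads_output, tid):
    """Return the kgdb index of the thread with this tid, or None."""
    for line in info_threads_output.splitlines():
        match = THREAD_LINE.match(line)
        if match and int(match.group(2)) == tid:
            return int(match.group(1))
    return None


# Hypothetical sample listing, for illustration only.
SAMPLE = """\
  43 Thread 100110 (PID=0: example-a)  example_func ()
* 42 Thread 100109 (PID=0: example-b)  example_func ()
  41 Thread 100108 (PID=0: example-c)  example_func ()
"""

if __name__ == "__main__":
    index = find_thread_index(SAMPLE, 100109)
    # The index printed here is what you would pass to "thread {index}".
    print("kgdb thread index:", index)
```

The index it reports is the number to use in step 3 ("thread {index}")
before running "bt".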

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |


