System Hang

Jorn Argelo jorn at wcborstel.nl
Fri May 21 00:15:14 PDT 2004


Nicholas Bernstein wrote:

>hello all,
>I'm hoping someone can give me a hand with this. I have a suspicion as
>to what is causing this, but I don't want to "taint" any replies I get.
>If any of the knowledgeable folks out there could help me out, offer
>possible areas to look into, better places to contact, or anything that
>could possibly be helpful, I would really, really appreciate it. 
>
>					thanks in advance,
>					Nick
>Info follows: 
>
>
>System: 
>-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>HP Proliant DL140 
>http://h18004.www1.hp.com/products/servers/proliantdl140/index.html
>FreeBSD 5.2-CURRENT #0 standard kernel
>2 xeon (hyperthreaded) processors
>
>Problem Description: 
>-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>
>System becomes unresponsive and "hangs". System does not respond to
>keyboard, network or any other type of input. 
>
>
>Error Message:
>-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>
>panic: Assertion TD_ON_SLEEPQ(td) failed at
>/usr/src/sys/kern/subr_sleepqueue.c:783 at line 783 in file:
>/usr/src/sys/kern/subr_sleepqueue.c
>cpuid=1
>Debugger("panic")
>Spin lock sched lock held by 0x617eb00 for > 5
>
>Related info:
>-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>
>/usr/src/sys/kern/subr_sleepqueue.c:
>...
>    770 /*
>    771  * Abort a thread as if an interrupt had occured.  Only abort
>    772  * interruptable waits (unfortunately it isn't safe to abort others).
>    773  *
>    774  * XXX: What in the world does the comment below mean?
>    775  * Also, whatever the signal code does...
>    776  */
>    777 void
>    778 sleepq_abort(struct thread *td)
>    779 {
>    780         void *wchan;
>    781
>    782         mtx_assert(&sched_lock, MA_OWNED);
>    783         MPASS(TD_ON_SLEEPQ(td));
>    784         MPASS(td->td_flags & TDF_SINTR);
>    785
>    786         /*
>    787          * If the TDF_TIMEOUT flag is set, just leave. A
>    788          * timeout is scheduled anyhow.
>    789          */
>    790         if (td->td_flags & TDF_TIMEOUT)
>    791                 return;
>    792
>    793         CTR3(KTR_PROC, "sleepq_abort: thread %p (pid %d, %s)", td,
>    794             td->td_proc->p_pid, td->td_proc->p_comm);
>    795         wchan = td->td_wchan;
>    796         mtx_unlock_spin(&sched_lock);
>    797         sleepq_remove(td, wchan);
>    798         mtx_lock_spin(&sched_lock);
>    799 } 
>
>
>Also, in order for the machine to detect its Broadcom 5700 network
>cards, I had to add the line 
>	acpi_load="no"
>to my /boot/loader.conf. Upon reboot the network cards show up in an
>ifconfig and work perfectly. 
>
>
>Possible references:
>-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>
>This isn't the exact same error, but it's the closest thing I could find
>to my error: 
>
>http://lists.freebsd.org/pipermail/freebsd-current/2004-March/022633.html
>
>This error is also pretty close, but not the same thing:
>
>http://groups.google.com/groups?q=%27panic:+Assertion+TD_ON_SLEEPQ(td)+failed+at%27&hl=en&lr=&ie=UTF-8&safe=off&selm=200405182105.04275.thierry%40herbelot.com&rnum=1
>
>
>				. . . 
>		Thanks for taking the time to read this. 
>				. . . 
>
>  
>
I wonder ... why do you want to run CURRENT on a machine like that? It's the 
bleeding-edge source code, which is unstable most of the time. You 
might want to consider running 4.9 on that machine, which is the 
production release. You can try 5.2.1 as well, but it still falls under 
the unstable branch.

So in other words, post your error to the CURRENT mailing list, and 
switch back to 4.9. I think that will solve many of your problems.
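
If you are not sure which branch a box is actually tracking, the version 
string from `uname -r` usually tells you. Here is a rough sketch; the 
`branch_of` helper is made up for illustration, and it only looks at the 
suffix of the string, so it says nothing about whether a given release 
line is considered production-ready.

```sh
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): classify a version string,
# as printed by `uname -r`, by its branch suffix.
branch_of() {
    case "$1" in
        *-CURRENT*) echo "CURRENT" ;;   # development branch
        *-STABLE*)  echo "STABLE" ;;    # stable branch between releases
        *-RELEASE*) echo "RELEASE" ;;   # a tagged release, possibly patched
        *)          echo "unknown" ;;   # anything else
    esac
}

# Report the branch of the running system.
branch_of "$(uname -r)"
```

On the machine described above, which reports 5.2-CURRENT, this should 
print CURRENT.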

Cheers,

Jorn


