Unkillable KSE threaded proc
Andrew Gallatin
gallatin at cs.duke.edu
Fri Sep 17 04:47:39 PDT 2004
Julian Elischer writes:
> John Baldwin wrote:
> > On Thursday 16 September 2004 09:42 am, Andrew Gallatin wrote:
> >
> >>Julian Elischer writes:
> >> > Andrew, please try -current on its own now..
> >> > I have checked in some fixes that have helped others.
> >>
> >>OK, preemption off... Still a system lockup, but a little different.
> >>
> >>The interesting thing here is that continuing and breaking into the
> >>debugger repeatedly seems to show that thread 0xc1646af0 is looping in
> >>exit. I've seen him in thread_single, thread_suspend_check, and in
> >>exit itself at kern_exit.c:163, etc. A breakpoint in
> >>thread_suspend_one never triggers, so I guess he's holding the proc
> >>lock and just looping forever. A breakpoint in _mtx_assert() shows
> >>him asserting the proc lock in thread_suspend_check at kern_thread.c:898.
> >>Over and over.
> >
> >
> > There is definitely some sort of infinite loop here. Stripping out the
> > comments in exit1() for that section of code reveals basically:
> >
> > PROC_LOCK(p);
> > if (p->p_flag & P_HADTHREADS) {
> > retry:
> >         thread_suspend_check(0);
> >         if (thread_single(SINGLE_EXIT))
> >                 goto retry;
> > }
> > p->p_flag |= P_WEXIT;
> > PROC_UNLOCK(p);
> >
> > So it's easy to see how it can get stuck in a loop, I think. If thread_single()
> > never drops the lock then other threads that are waiting to die can't
> > actually wait because they can never get the proc lock so that they can die.
> >
>
>
> hmm interesting..
> but this code hasn't changed in ages...
>
>
> in thread_single we see:
>
> thread_suspend_one(td);
> PROC_UNLOCK(p);
> mi_switch(SW_VOL, NULL);
> mtx_unlock_spin(&sched_lock);
> PROC_LOCK(p);
> mtx_lock_spin(&sched_lock);
>
> so when it sleeps it releases the proc lock.
But that's the problem. As I said above, break in thread_suspend_one
never triggers, so this code is never called. It must be bailing
out of thread_suspend_one() before this happens.
Did somebody fix ddb? If yes, I can try stepping through it if you like.
Maybe a quick fix would be to drop the proc lock and tsleep for a
clock tick at the bottom of the infinite loop...
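
Roughly, the shape of that quick fix is sketched below. It is an untested
userland analogy, not kernel code: a pthread mutex stands in for the proc
lock, nanosleep() stands in for tsleep(), and every name in it (proc_lock,
running, worker, others_still_running, exit_with_backoff) is made up for
illustration. The workers need the lock before they can record that they
have stopped, so the looping thread only makes progress because it lets go
of the lock at the bottom of each pass.

```c
#include <pthread.h>
#include <time.h>

#define NWORKERS 3

/* Hypothetical stand-in for the proc lock. */
static pthread_mutex_t proc_lock = PTHREAD_MUTEX_INITIALIZER;
/* Workers that have not yet stopped; protected by proc_lock. */
static int running;

/* A worker must take the lock before it can record that it has stopped. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&proc_lock);
    running--;
    pthread_mutex_unlock(&proc_lock);
    return NULL;
}

/* Stand-in for a check that returns nonzero ("retry") without ever
 * dropping the lock. Call with proc_lock held. */
static int others_still_running(void)
{
    return running > 0;
}

/* Loop until all workers have stopped, dropping the lock and sleeping
 * for about one tick at the bottom of each pass. Returns the number of
 * retries that were needed. */
int exit_with_backoff(void)
{
    pthread_t tid[NWORKERS];
    int created[NWORKERS];
    struct timespec tick = { 0, 10 * 1000 * 1000 }; /* 10 ms, assumed tick */
    int i, retries = 0;

    running = NWORKERS;
    pthread_mutex_lock(&proc_lock);

    /* Start the workers while holding the lock, so they all block on it. */
    for (i = 0; i < NWORKERS; i++) {
        created[i] = (pthread_create(&tid[i], NULL, worker, NULL) == 0);
        if (!created[i])
            running--;      /* never started, so nothing to wait for */
    }

    while (others_still_running()) {
        retries++;
        /* The quick fix: give the others a chance to get the lock. */
        pthread_mutex_unlock(&proc_lock);
        nanosleep(&tick, NULL);
        pthread_mutex_lock(&proc_lock);
    }
    pthread_mutex_unlock(&proc_lock);

    for (i = 0; i < NWORKERS; i++)
        if (created[i])
            pthread_join(tid[i], NULL);
    return retries;
}
```

Build with -pthread and call exit_with_backoff() from main(). Take out the
unlock/sleep/lock at the bottom of the loop and it spins forever with the
lock held, which is the same shape as what I am seeing.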
Drew