[Patch] panics/hangs with preemption and threads.
Julian Elischer
julian at elischer.org
Mon Sep 13 11:08:59 PDT 2004
John Baldwin wrote:
>On Sunday 12 September 2004 02:39 am, Julian Elischer wrote:
>
>
>>Guys, I think I found a (the?) major cause of the corruption of the
>>ksegrp/thread run queue for threaded processes when preemption is turned on.
>>
>>When a thread is scheduled in setrunqueue(), the first thing that is done
>>is that it is put in the correct place in the ksegrp's run queue.
>>Then, if it is in the top N spots (where N is the defined concurrency
>>and is usually <= NCPU), it is passed down to the system scheduler
>>using sched_add().
>>sched_add() can call maybe_preempt(), which can decide to switch out the
>>current thread and switch to the new one immediately.
>>The trouble with that is that we have already put the new one on the
>>ksegrp's run queue! When that thread is next put on the run queue using
>>setrunqueue() it is already there, and we end up with an infinitely looping
>>run queue. Any code that follows that list will never end, and the system
>>will freeze.
>>
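As a rough user-space illustration of the failure described above (not the
kernel code; toy_thread, toy_runq and both helpers are made-up names, and the
real ksegrp queue is priority-sorted rather than a plain tail-append list),
adding an entry that is already at the tail of a linked queue makes it point
at itself, so a walk of the list never reaches the end:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a thread on a group run queue (hypothetical names). */
struct toy_thread {
    int id;
    struct toy_thread *next;
};

struct toy_runq {
    struct toy_thread *head;
    struct toy_thread *tail;
};

/* Append td at the tail without checking whether it is already queued. */
static void toy_runq_add(struct toy_runq *q, struct toy_thread *td)
{
    assert(q != NULL && td != NULL);
    td->next = NULL;
    if (q->tail != NULL)
        q->tail->next = td;     /* if tail == td, td now points at itself */
    else
        q->head = td;
    q->tail = td;
}

/*
 * Walk the queue, giving up after `limit` steps.
 * Returns the number of entries visited, or -1 if the walk exceeded the
 * limit, which in this toy means the list loops back on itself.
 */
static int toy_runq_walk(const struct toy_runq *q, int limit)
{
    int n = 0;

    for (const struct toy_thread *td = q->head; td != NULL; td = td->next) {
        if (++n > limit)
            return -1;
    }
    return n;
}
```

Calling toy_runq_add() twice on the same thread is enough to make
toy_runq_walk() hit its limit; kernel code walking the real list has no such
limit and would simply spin.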
>>Here is a patch that solves it, but I'm not happy about it.
>>John, you wrote the preemption code.
>>Do you have any ideas about how to do this more cleanly?
>>
>>One possibility is to make sched_add() return a value that indicates whether
>>the thread was handled immediately. That would allow setrunqueue() to put it
>>into the ksegrp's run queue only if it was not already handled.
>>
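The return-value idea could look roughly like the sketch below. It is a toy
model under stated assumptions, not the patch: all toy_* names are
hypothetical, the real setrunqueue() and sched_add() take different
arguments, and a real scheduler would also have to requeue the thread that
was preempted, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_thread {
    int prio;                   /* lower value = more urgent */
    bool on_runq;               /* true while linked on the group queue */
    struct toy_thread *next;
};

struct toy_runq {
    struct toy_thread *head;
    struct toy_thread *tail;
};

/* Toy notion of "the thread running right now". */
static struct toy_thread *toy_curthread;

/*
 * Hand td to the (toy) system scheduler.
 * Returns true if td preempted the current thread and is already running,
 * false if the caller still has to queue it.
 */
static bool toy_sched_add(struct toy_thread *td)
{
    if (toy_curthread != NULL && td->prio < toy_curthread->prio) {
        toy_curthread = td;     /* "switch" to the new thread at once */
        return true;
    }
    return false;
}

/* Queue td on the group run queue only if it was not run immediately. */
static void toy_setrunqueue(struct toy_runq *q, struct toy_thread *td)
{
    assert(q != NULL && td != NULL && !td->on_runq);
    if (toy_sched_add(td))
        return;                 /* already running: must not be queued */

    td->next = NULL;
    if (q->tail != NULL)
        q->tail->next = td;
    else
        q->head = td;
    q->tail = td;
    td->on_runq = true;
}
```

The point of the boolean is only that the group queue insert and the
"run it now" path become mutually exclusive, so a later setrunqueue() on the
same thread never finds it already linked.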
>>Other suggestions welcome.
>>
>>
>
>I think it's probably a good idea to do the preemption check before putting
>the thread on the kse group. However, that might break ULE and some things
>it does (ULE pins interrupt threads but does it in sched_add, perhaps that is
>a hack and the pinning should be done in ithread_schedule instead). Changing
>sched_add() to return a boolean similar to maybe_preempt() is probably ok as
>an alternative then. Also, there's really no need for an additional
>SRQ_NOPREEMPT flag, that just duplicates critical_enter()/critical_exit().
>The same is probably true of SRQ_YIELDING and SRQ_MYSELF (preemption already
>doesn't preempt to curthread since the priorities are equal). The place that
>uses SRQ_YIELDING can just add a critical section around the call to
>setrunqueue(). Note that when a preemption is deferred due to a nested
>critical section, the preemption doesn't actually occur until the outermost
>critical section is exited, so if you do this:
>
>    mtx_lock_spin(&sched_lock);
>    /* ... other work done under sched_lock ... */
>    if (foo) {
>        critical_enter();
>        setrunqueue(td2);
>        critical_exit();
>        mi_switch(NULL, SW_VOL);
>    }
>    mtx_unlock_spin(&sched_lock);
>
>That won't actually preempt.
>
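The deferral rule John describes can be modelled in user space as a nesting
counter plus an "owed preemption" flag. This is only a sketch: toy_cpu and
the toy_* functions are invented for illustration and are not the kernel's
critical_enter()/critical_exit() or maybe_preempt(). In the snippet above,
the outer nesting level would correspond to holding the sched_lock spin
mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-CPU state (hypothetical; the kernel keeps this per thread). */
struct toy_cpu {
    int critnest;       /* depth of nested critical sections */
    bool owe_preempt;   /* a preemption was requested while nested */
    int preemptions;    /* how many preemptions actually happened */
};

static void toy_critical_enter(struct toy_cpu *c)
{
    c->critnest++;
}

/* Request a preemption: run it now if allowed, otherwise remember it. */
static void toy_maybe_preempt(struct toy_cpu *c)
{
    if (c->critnest > 0) {
        c->owe_preempt = true;  /* deferred until the outermost exit */
        return;
    }
    c->preemptions++;
}

/* Leave one nesting level; only the outermost exit runs a deferred one. */
static void toy_critical_exit(struct toy_cpu *c)
{
    assert(c->critnest > 0);
    c->critnest--;
    if (c->critnest == 0 && c->owe_preempt) {
        c->owe_preempt = false;
        c->preemptions++;
    }
}
```

With two nested enters, a preemption request followed by the inner exit
leaves preemptions at 0; it only becomes 1 on the outer exit.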
The flags were not only for this problem but also for another scheduler
I was playing with, and I thought they might be useful in trying to
find/fix this problem.
The critical nest thing is OK, but I had a lot of debug code in at one
stage and I wanted to know more about where I had come from, so the
flags I already had gave me that.
I don't see the harm in having more information, but I realised
afterwards that some of this info was already available:
for MYSELF, curthread == newtd, and for INTR the process/thread is marked
as an interrupt thread, which leaves only "Yielding",
which I do think is useful info to know. But the critnest solves the
same problem in a more specific manner.