Stop scheduler on panic
Andriy Gapon
avg at FreeBSD.org
Thu Nov 17 19:42:57 UTC 2011
on 17/11/2011 21:09 John Baldwin said the following:
> On Thursday, November 17, 2011 11:58:03 am Andriy Gapon wrote:
>> on 17/11/2011 18:37 John Baldwin said the following:
>>> On Thursday, November 17, 2011 4:47:42 am Andriy Gapon wrote:
>>>> on 17/11/2011 10:34 Andriy Gapon said the following:
>>>>> on 17/11/2011 10:15 Kostik Belousov said the following:
>>>>>> I have the following change for eons on my test boxes. Without it,
>>>>>> I simply cannot get _any_ dump.
>>>>>>
>>>>>> diff --git a/sys/cam/cam_xpt.c b/sys/cam/cam_xpt.c
>>>>>> index 10b89c7..a38e42f 100644
>>>>>> --- a/sys/cam/cam_xpt.c
>>>>>> +++ b/sys/cam/cam_xpt.c
>>>>>> @@ -4230,7 +4230,7 @@ xpt_done(union ccb *done_ccb)
>>>>>> TAILQ_INSERT_TAIL(&cam_simq, sim, links);
>>>>>> mtx_unlock(&cam_simq_lock);
>>>>>> sim->flags |= CAM_SIM_ON_DONEQ;
>>>>>> - if (first)
>>>>>> + if (first && panicstr == NULL)
>>>>>> swi_sched(cambio_ih, 0);
>>>>>> }
>>>>>> }
>>>>>
>>>>> I think that this (or similar) change should go into the patch and the tree.
>>>>>
>>>>
>>>> And, BTW, I still would like to do something like the following (perhaps with
>>>> td_oncpu = NOCPU and td_flags &= ~TDF_NEEDRESCHED also moved to the common code):
>>>>
>>>> Index: sys/kern/sched_ule.c
>>>> ===================================================================
>>>> --- sys/kern/sched_ule.c (revision 227608)
>>>> +++ sys/kern/sched_ule.c (working copy)
>>>> @@ -1790,7 +1790,6 @@ sched_switch(struct thread *td, struct thread *new
>>>> td->td_oncpu = NOCPU;
>>>> if (!(flags & SW_PREEMPT))
>>>> td->td_flags &= ~TDF_NEEDRESCHED;
>>>> - td->td_owepreempt = 0;
>>>> tdq->tdq_switchcnt++;
>>>> /*
>>>> * The lock pointer in an idle thread should never change. Reset it
>>>> Index: sys/kern/kern_synch.c
>>>> ===================================================================
>>>> --- sys/kern/kern_synch.c (revision 227608)
>>>> +++ sys/kern/kern_synch.c (working copy)
>>>> @@ -406,6 +406,8 @@ mi_switch(int flags, struct thread *newtd)
>>>> ("mi_switch: switch must be voluntary or involuntary"));
>>>> KASSERT(newtd != curthread, ("mi_switch: preempting back to ourself"));
>>>>
>>>> + td->td_owepreempt = 0;
>>>> +
>>>> /*
>>>> * Don't perform context switches from the debugger.
>>>> */
>>>> Index: sys/kern/sched_4bsd.c
>>>> ===================================================================
>>>> --- sys/kern/sched_4bsd.c (revision 227608)
>>>> +++ sys/kern/sched_4bsd.c (working copy)
>>>> @@ -940,7 +940,6 @@ sched_switch(struct thread *td, struct thread *new
>>>> td->td_lastcpu = td->td_oncpu;
>>>> if (!(flags & SW_PREEMPT))
>>>> td->td_flags &= ~TDF_NEEDRESCHED;
>>>> - td->td_owepreempt = 0;
>>>> td->td_oncpu = NOCPU;
>>>>
>>>> /*
>>>>
>>>> Does anybody see any potential problems with such a change?
>>>
>>> Hmm, does this mean the preemption will be lost if you break into the
>>> debugger and continue in the non-panic case?
>>
>> Not sure which exact scenario you have in mind.
>> Please note that the above diff just moves resetting of td_owepreempt to an
>> earlier place. As far as I can see there are no checks of td_owepreempt value
>> between the new place and the old places.
>
> I'm worried that you are clearing td_owepreempt even in cases where a context
> switch is not performed. So say you enter DDB with td_owepreempt set and that
> DDB bails on a context switch. With your change it will now clear td_owepreempt
> and "lose" the preemption.
>
And without the change we get the recursion and double-fault because of
kdb_switch -> thread_unlock -> spinlock_exit -> critical_exit ->
mi_switch in this case ?
BTW, it is my opinion that we really should not let the debugger code call
mi_switch for any reason.
--
Andriy Gapon
More information about the freebsd-current
mailing list