Stop scheduler on panic

Fri Nov 18 21:59:38 UTC 2011

on 17/11/2011 23:38 John Baldwin said the following:
> On Thursday, November 17, 2011 4:35:07 pm John Baldwin wrote:
>> Hmmm, you could also make critical_exit() not perform deferred preemptions
>> if SCHEDULER_STOPPED?  That would fix the recursion and still let the
>> preemption "work" when resuming from the debugger?

Yes, that's a good solution, I think.  I just didn't want to touch such "low
level" code, but I guess there is no rational reason for that.
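
To make the suggestion concrete, here is a minimal userland sketch of the idea,
not the actual kernel code.  All names (model_thread, model_scheduler_stopped,
model_critical_exit, etc.) are hypothetical stand-ins, and the context switch is
modelled by a counter.  The point is only that the owed preemption stays pending
while the scheduler is stopped and can still be honoured by a later
critical_exit() once things resume:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-thread state that critical sections use. */
struct model_thread {
	int  critnest;    /* critical section nesting depth */
	bool owepreempt;  /* a preemption was deferred while in the section */
};

/* Hypothetical stand-in for the SCHEDULER_STOPPED() condition. */
static bool model_scheduler_stopped;

/* Counts how many deferred preemptions were actually performed. */
static int model_preemptions;

static void
model_critical_enter(struct model_thread *td)
{
	td->critnest++;
}

static void
model_critical_exit(struct model_thread *td)
{
	if (td->critnest == 1) {
		td->critnest = 0;
		/*
		 * Skip the deferred preemption while the scheduler is
		 * stopped; owepreempt is left set so that the preemption
		 * is not lost.
		 */
		if (td->owepreempt && !model_scheduler_stopped) {
			td->owepreempt = false;
			model_preemptions++;	/* models the context switch */
		}
	} else {
		td->critnest--;
	}
}
```

With model_scheduler_stopped set, leaving the outermost critical section does
not switch; after it is cleared, the next enter/exit pair performs the owed
preemption.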

> Or you could let the debugger run in a permanent critical section (though
> perhaps that breaks too many other things like debugger routines that try
> to use locks like the 'kill' command (which is useful!)).  Effectively what you
> are trying to do is having the debugger run in a critical section until the
> debugger is exited.  It would be cleanest to let it run that way explicitly
> if possible since then you don't have to catch as many edge cases.

I like this idea, but likely it would take some effort to get done.

Related to this is something that I attempted to discuss before.  I think that
because the debugger acts on a frozen system image (the debugger is a sole actor
and observer), the debugger should try to minimize its interaction with the
debugged system.  In this vein I think that the debugger should also bypass any
locks just like with SCHEDULER_STOPPED.  The debugger should also be careful to
note the state of any subsystems that it uses (e.g. the keyboard) and return
them to their initial state if it had to change them.  But that's a bit of a
different story.  And I really get beyond my knowledge zone when I try to think
about things like handling 'call func_xxxx' in the debugger where func_xxxx may
want to acquire some locks or noticeably change some state within a system.

But to continue about the locks... I have this idea to re-implement
SCHEDULER_STOPPED as some more general check that could be abstractly denoted as
LOCKING_POLICY_CHECK(context).  Where the context could be defined by flags like
normal, in-panic, in-debugger, etc.  And the locking policies could be: normal,
bypass, warn, panic, etc.
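
A rough userland sketch of what I mean, purely illustrative.  Only the name
LOCKING_POLICY_CHECK comes from the idea above; the enums, the table, the
particular context-to-policy mapping and model_lock_acquire() are all
assumptions made up for the example:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical execution contexts. */
enum lock_context { CTX_NORMAL, CTX_IN_PANIC, CTX_IN_DEBUGGER, CTX_COUNT };

/* Hypothetical locking policies. */
enum lock_policy { POLICY_NORMAL, POLICY_BYPASS, POLICY_WARN, POLICY_PANIC };

/* Assumed mapping for illustration only; the real one would need discussion. */
static const enum lock_policy policy_table[CTX_COUNT] = {
	[CTX_NORMAL]      = POLICY_NORMAL,
	[CTX_IN_PANIC]    = POLICY_BYPASS,
	[CTX_IN_DEBUGGER] = POLICY_WARN,
};

#define LOCKING_POLICY_CHECK(ctx)	(policy_table[(ctx)])

struct model_lock {
	bool held;
};

/* Counts lock operations that were flagged by the "warn" policy. */
static int model_warnings;

/* Returns true if the lock was really taken, false if it was bypassed. */
static bool
model_lock_acquire(struct model_lock *lk, enum lock_context ctx)
{
	switch (LOCKING_POLICY_CHECK(ctx)) {
	case POLICY_BYPASS:
		return (false);		/* pretend success, touch nothing */
	case POLICY_WARN:
		model_warnings++;	/* record it, then bypass as well */
		return (false);
	case POLICY_PANIC:
		abort();		/* locking is a bug in this context */
	case POLICY_NORMAL:
	default:
		lk->held = true;
		return (true);
	}
}
```

In this sketch the "warn" policy also bypasses the lock; whether it should do
that or proceed with normal locking is one of the open questions.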

However, I am not sure if this could be useful (and doable properly) in
practice.  I am just concerned with the interaction between the debugger and the
locks.  It still seems to me inconsistent that we are going with
SCHEDULER_STOPPED for panic, but we are continuing to use "if (!kdb_active)"
around some locks that could be problematic under kdb (e.g. in USB).  In my
opinion the amount of code shared between normal context and kdb context is
about the same as the amount of code shared between normal context and panic
context.  But I haven't really quantified either.

-- 
Andriy Gapon