common entry point for "software" and "hardware" "panics"

Tue Aug 23 16:39:05 UTC 2011

On Tue, Aug 23, 2011 at 7:09 AM, Andriy Gapon <avg at freebsd.org> wrote:
>
> Too many quote signs in the subject line... let me try to explain.
>
> Currently we have two sources of detecting some trouble/inconsistency that
> requires a system panic/reset or debugging.  One source is various checks in the
> program code (e.g. KASSERTs) that call panic() when a fatal inconsistency is
> detected.  The other source is the hardware that generates a trap when something
> is wrong from its point of view.  In this case the trap need not be a fatal one,
> so the software (the kernel) checks a type of trap and decides whether the
> condition is fatal.  But let's distinguish the purely software source from the
> hardware+software source.
>
> Depending on the kernel options/configuration the kernel can also react in
> different ways to the fatal conditions.  One way is to call panic(9); the other
> way is to call kdb_trap.  But it's even a little bit more complicated than that.
>
> So, let's consider some possibilities.
>
> !KDB, software problem:
> panic -> kern_reboot
>
> !KDB, fatal trap:
> trap -> trap_fatal -> panic -> kern_reboot
>
> KDB, !KDB_UNATTENDED, software problem:
> panic -> kdb_enter -> breakpoint ~> trap -> kdb_trap
>
> KDB, !KDB_UNATTENDED, fatal trap:
> trap -> trap_fatal -> kdb_trap
>
> Also, kdb key from console:
> kdb_enter -> breakpoint ~> trap -> kdb_trap
>
> panic key from console:
> kdb_panic -> panic -> ...
>
> and also some code calls kdb_enter instead of panic in situations that require
> debugging:
> kdb_enter -> breakpoint ~> trap -> kdb_trap
>
> So, we can see in these examples that we currently do not have a function
> that would be called in all the cases.
> I think that it would be nice if we had some sort of a (semi-)universal front-end
> to panic and kdb_trap.  E.g. it could be useful for some common tasks like
> stopping other CPUs in an SMP environment.  It could also be useful for printing
> information that is needed in both cases, e.g. a stack trace.  Or perhaps for
> deciding in one common place whether KDB should actually be entered.
>
> Unfortunately, this is not a proposal, just sort of musings on the topic.
> Does anybody have some more concrete ideas here?
> Thank you!
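
For reference, the paths listed in the quoted message can be replayed with a
small userland model.  This is only a sketch: each function just records its
name, the hardware indirection written as "~>" above is flattened to "->", and
the helper names (step, software_problem_path, fatal_trap_path, kdb_attended)
are made up for illustration and do not exist in the kernel.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Userland model of the call paths in the quoted message.  Each function only
 * records its name; none of this is the real kernel code.
 */
enum trap_kind { TRAP_BREAKPOINT, TRAP_FATAL };

static char path[128];
static bool kdb_attended;	/* models KDB && !KDB_UNATTENDED */

static void model_trap(enum trap_kind kind);

/* Append one function name to the recorded path. */
static void
step(const char *name)
{
	if (path[0] != '\0')
		strncat(path, " -> ", sizeof(path) - strlen(path) - 1);
	strncat(path, name, sizeof(path) - strlen(path) - 1);
}

static void
model_kdb_enter(void)
{
	step("kdb_enter");
	step("breakpoint");
	model_trap(TRAP_BREAKPOINT);
}

static void
model_panic(void)
{
	step("panic");
	if (kdb_attended)
		model_kdb_enter();
	else
		step("kern_reboot");
}

static void
model_trap(enum trap_kind kind)
{
	step("trap");
	if (kind == TRAP_BREAKPOINT) {
		step("kdb_trap");
		return;
	}
	step("trap_fatal");
	if (kdb_attended)
		step("kdb_trap");
	else
		model_panic();
}

/* Hypothetical drivers: reset the log, run one scenario, return the path. */
const char *
software_problem_path(bool attended)
{
	path[0] = '\0';
	kdb_attended = attended;
	model_panic();
	return (path);
}

const char *
fatal_trap_path(bool attended)
{
	path[0] = '\0';
	kdb_attended = attended;
	model_trap(TRAP_FATAL);
	return (path);
}
```

Running both drivers with attended set to false and to true gives four
paths, and no recorded name appears in all four: panic is missing from the
KDB fatal-trap path, and trap is missing from the !KDB software path.  That
is the gap the quoted message describes.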

I vote for the status quo. :-)

That is, it seems to me that the intents behind kdb_enter() and panic()
are very different.  With a software fault, panic() is usually the right
thing (since we have no way at the moment to e.g. restart the VM
subsystem).  debugger_on_panic then gets you a debugger if desired.
kdb_enter() or breakpoint() should not be in "production" code since
there may be no debugger.  It seems useful to me only for intermediate
debugging, and any particular use should go away when the problem is
known and fixed.
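
To make the distinction concrete, here is a rough userland model of the two
calls.  It is a sketch, not kernel code: the function and enum names are made
up for illustration, the model ignores what happens after the debugger is
left, and it assumes that a bare kdb_enter()-style call has no effect when no
debugger is available (check the kernel source for the actual behaviour).

```c
#include <stdbool.h>

/* Userland model only; the bodies are stand-ins, not kernel code. */
enum outcome {
	OUTCOME_REBOOT,		/* system goes down via the normal panic path */
	OUTCOME_DEBUGGER,	/* control is handed to a debugger */
	OUTCOME_CONTINUES	/* nothing happened; execution carries on */
};

/* Models panic() with a debugger_on_panic-style knob. */
enum outcome
model_panic(bool debugger_on_panic, bool debugger_present)
{
	if (debugger_on_panic && debugger_present)
		return (OUTCOME_DEBUGGER);
	return (OUTCOME_REBOOT);
}

/*
 * Models a bare kdb_enter() call.  Assumption of this sketch: with no
 * debugger available, the call has no effect.
 */
enum outcome
model_kdb_enter(bool debugger_present)
{
	return (debugger_present ? OUTCOME_DEBUGGER : OUTCOME_CONTINUES);
}
```

In this model panic() always ends in a defined outcome, with the debugger as
an opt-in, whereas kdb_enter() on a kernel without a debugger lets execution
continue past the problem it was meant to flag.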

Cheers,
matthew