svn commit: r247600 - in head/sys: conf sparc64/pci

Bruce Evans brde at optusnet.com.au
Sat Mar 2 06:34:50 UTC 2013


On Sat, 2 Mar 2013, Marius Strobl wrote:

> Log:
>  - While Netra X1 generally show no ill effects when registering a power
>    fail interrupt handler, there seems to be either a broken batch of them
>    or a tendency to develop a defect which causes this interrupt to fire
>    inadvertently. Given that apart from this problem these machines work
>    just fine, add a tunable allowing the setup of the power fail interrupt
>    to be disabled.
>    While at it, remove the DEBUGGER_ON_POWERFAIL compile time option and
>    make that behavior also selectable via the newly added tunable.
>  - Apparently, it's no longer a problem to call shutdown_nice(9) from within
>    an interrupt filter (some other drivers in the tree do the same). So
>    change the power fail interrupt from a handler to a filter in order to
>    simplify the code and get rid of a !INTR_MPSAFE handler.

Gak!  It is an error to call any() from within a fast interrupt
handler.  Even with fast interrupt handlers broken to be interrupt
filters, it is an error to call almost any().  shutdown_nice() is an
especially invalid any().  It sends a signal to init, and uses many
sleep locks for this.  So you have the interrupt filter which is locked
by critical_enter() and probably also by hard-disabling interrupts on
the current CPU, calling up to code locked by sleep mutexes.  This
asks for deadlock, and gets it when the interrupt preempts code holding
one of the sleep locks that is wandered into.
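
To make the constraint concrete, here is a minimal userspace model in C, not
kernel code.  All names in it (model_shutdown_nice(), model_powerfail_filter(),
model_deferred_worker(), in_filter_context) are hypothetical stand-ins.  It
assumes the usual fix is for the filter to do nothing except record the event,
and for code running later in thread context to make the call that may need
sleep locks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Models "inside critical_enter() / interrupts hard-disabled". */
static bool in_filter_context;

/* Set by the filter, consumed later in thread context. */
static atomic_bool shutdown_requested;

/* Counts completed calls so the behavior can be observed. */
static int shutdowns_performed;

/*
 * Hypothetical stand-in for shutdown_nice(): it may take sleep locks,
 * so it must never run while in_filter_context is set.
 */
static void
model_shutdown_nice(void)
{
	assert(!in_filter_context);
	shutdowns_performed++;
}

/* The filter only records that the event happened and returns. */
static void
model_powerfail_filter(void)
{
	in_filter_context = true;
	atomic_store(&shutdown_requested, true);
	in_filter_context = false;
}

/* Runs in ordinary thread context, where waiting on a lock is allowed. */
static void
model_deferred_worker(void)
{
	if (atomic_exchange(&shutdown_requested, false))
		model_shutdown_nice();
}
```

Calling model_shutdown_nice() directly from model_powerfail_filter() would
trip the assertion, which is the model's equivalent of the deadlock above.
In the real kernel the deferral would presumably go through an interrupt
thread or taskqueue(9); that mapping is an assumption and is not part of the
commit.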

The other broken drivers that do this seem to be mainly serial console
drivers.  Their debugger entry was subverted into calling panic() or
shutdown_nice() according to an escape sequence.  Even the debugger
entry part of this was broken by changing it from a hard breakpoint
to a kdb_enter() call which does invalid things (it accesses global
state without locking, and calls printf() before entering debugger
context).  shutdown_nice() is also called from acpi and from syscons.
I think the latter still uses an ordinary (Giant locked) interrupt
handler.  Calling shutdown_nice() from there has a chance of never
deadlocking.  It would just have to wait if the interrupt interrupted
something holding shutdown_nice()'s locks.

The dangerous calls in syscons are actually from scgetc().  I think
they are reachable in debugger mode too.  Then they are invalid.  So
are ddb's commands for rebooting and panicking.  You can't call any()
from ddb either, but these commands do.  A non-broken version of these
commands (or "call any()") would exit from ddb context after arranging
to make the call using a trampoline.  The call might fail, but then
it is not the fault of ddb's context.  The trampoline is needed to
regain control if the call can return.  The reboot and panic commands
are safer than most of the unsafe ones since they are supposed to give
an unclean shutdown and if they don't work then nothing much worse than
a recursive unclean shutdown can happen.
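
The trampoline idea can be sketched as a small userspace model in C.  This is
not ddb code: every name here (model_db_arrange_call(), model_trampoline(),
model_debugger_session(), model_any(), in_debugger) is hypothetical, and the
single pending-call slot is an assumption made to keep the sketch short.
The debugger only records the call, leaves its context, and the trampoline
then makes the call and regains control when it returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*pending_fn)(int);

/* Models "currently inside debugger context". */
static bool in_debugger;

/* One pending call, recorded by the debugger and run by the trampoline. */
static pending_fn pending_call;
static int pending_arg;

/* Where the trampoline stores the result after regaining control. */
static int last_result;
static bool have_result;

/* Inside the debugger: only record what should be called later. */
static void
model_db_arrange_call(pending_fn fn, int arg)
{
	assert(in_debugger);
	pending_call = fn;
	pending_arg = arg;
}

/* Outside the debugger: make the call, then take control back. */
static void
model_trampoline(void)
{
	pending_fn fn;

	assert(!in_debugger);
	if (pending_call == NULL)
		return;
	fn = pending_call;
	pending_call = NULL;
	last_result = fn(pending_arg);	/* may fail, but not due to ddb context */
	have_result = true;
}

/* A debugger session that arranges a call and then exits its context. */
static void
model_debugger_session(pending_fn fn, int arg)
{
	in_debugger = true;
	model_db_arrange_call(fn, arg);
	in_debugger = false;
	model_trampoline();
}

/* Hypothetical any(): refuses to run in debugger context. */
static int
model_any(int x)
{
	assert(!in_debugger);
	return (x + 1);
}
```

If model_any() were called directly while in_debugger was set, its assertion
would fail; routed through the trampoline, it runs in normal context and its
return value is captured.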

Bruce