svn commit: r331298 - head/sys/dev/syscons
Warner Losh
imp at bsdimp.com
Thu Mar 22 00:31:50 UTC 2018
On Wed, Mar 21, 2018 at 11:53 AM, Bruce Evans <brde at optusnet.com.au> wrote:
> On Wed, 21 Mar 2018, Warner Losh wrote:
>
> Log:
>> Unlock giant when calling shutdown_nice()
>>
>
> This breaks the driver. Giant is syscons' driver lock, and also the
> interrupt handler lock for at least the atkbd keyboard driver, so vt
> sometimes holds the lock for.
>
OK. I got carried away. You're right. The proper fix is to unlock Giant at
the top of kern_reboot() instead. This handles the case where we call
shutdown_nice() with no init running and have to call kern_reboot directly.
Otherwise it's perfectly fine to just call shutdown_nice() with Giant held
since we just signal init from there. So I'll revert this change and make
that other change instead.
> vt has to sprinkle lots of Giant locking and unlocking where the
> corresponding locking is automatic for syscons, but this seems to be
> nonsense in its event handler. vt has to acquire Giant when calling
> the keyboard driver, but event handling is a call from the keyboard
> driver, usually from the interrupt handler with a suitable lock held.
> For atkbd, the suitable lock is Giant -- see atkbd_intr(). (All keyboard
> drivers that use the kbd(4) for the top level use Giant looking since
> toe top level specifies D_NEEDGIANT in its cdevsw. I think that is all
> keyboard drivers.) So where vt_kbdevent() sprinkles Giant, that has no
> effect since Giant is held by the caller. So vt_kbdevent() calls
> vt_processkey() with Giant held despite it not acquiring Giant explicitly;
> vt_processkey() calls vt_machine_kbdevent() and that calls various
> shutdown functions, all with Giant held.
True, and covered above.
> Modified: head/sys/dev/syscons/syscons.c
>> ============================================================
>> ==================
>> --- head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:08 2018
>> (r331297)
>> +++ head/sys/dev/syscons/syscons.c Wed Mar 21 14:47:12 2018
>> (r331298)
>> @@ -3858,22 +3858,28 @@ next_code:
>>
>> case RBT:
>> #ifndef SC_DISABLE_REBOOT
>> - if (enable_reboot && !(flags & SCGETC_CN))
>> + if (enable_reboot && !(flags & SCGETC_CN)) {
>> + mtx_unlock(&Giant);
>> shutdown_nice(0);
>> + }
>> #endif
>> break;
>>
>> case HALT:
>> #ifndef SC_DISABLE_REBOOT
>> - if (enable_reboot && !(flags & SCGETC_CN))
>> + if (enable_reboot && !(flags & SCGETC_CN)) {
>> + mtx_unlock(&Giant);
>> shutdown_nice(RB_HALT);
>> + }
>> #endif
>> break;
>>
>> case PDWN:
>> #ifndef SC_DISABLE_REBOOT
>> - if (enable_reboot && !(flags & SCGETC_CN))
>> + if (enable_reboot && !(flags & SCGETC_CN)) {
>> + mtx_unlock(&Giant);
>> shutdown_nice(RB_HALT|RB_POWEROFF);
>> + }
>> #endif
>> break;
>>
>
> The new bugs should cause assertion failures. shutdown_nice() just
> signals init and returns. Giant is not re-acquired, so assertions
> should fail when the caller drops the lock. So the shutdown should
> be a nasty panic.
>
I must have been using vt when I tested it, since I didn't see that.
> Transiently dropping the lock is probably not fatal depending on what
> the caller does. On x86 with atkbd, nested interrupts are prevented
> by masking the in the ATPIC or APIC -- Giant is not really the driver
> lock, but just a lock for interfacing between related screen and keyboard
> drivers.
>
> Serial console drivers with fast interrupt handlers have much more
> broken locking for ddb special keys. It is invalid to either drop locks
> or call the "any" function from a fast interrupt handler, but buggy
> serial console drivers calls kbd_alt_break(), and that now calls
> shutdown_nice() and other functions that cannot be called from a fast
> interrupt handler. ddb keys supply most of the shutdown_nice()
> functionality for serial consoles, and there are no escape sequence to
> get this without ddb or maybe another debugger, so these bugs don't
> affect most users.
>
It's called it before my changes... I'll make a note to look into this.
> Handling this correctly requires much the same fix as an unsafe signal
> handler, and fixes have much the same problems -- not much more than
> setting a flag is safe, and the flag might never be looked at if the
> system is in a bad state. However, if a nice shutdown is possible then
> the sytem must be in a good enough state to poll for flags.
>
> For normal signal handlers, there is no problem sending a signal to init
> or at least with setting a flag and waking up some thread to check the
> flag.
>
> I don't quite understand this commit. It should be valid to send a
> signal to init() in proc or non-fast ithread context including with
> Giant held. shutdown_nice() starts with PROC_LOCK(initproc), so it
> would be a LOR to call it with a spinlock held, and is even more
> obviously wrong to call it from kbd_alt_break() with interrupts masked
> and possibly a spinlock held, but Giant is a special sleep lock that
> causes fewer LORs than most sleep locks.
>
Yes, this commit is wrong.
> Actual testing shows that doesn't cause a panic, but it also doesn't
> actually unlock for shutdown_nice(), since the lock is acquired twice
> and only released once. syscons has much the same extra lock sprinkling
> for event handling as vt:
>
> - intr_event_execute_handlers() acquires Giant and calls atkbdintr()
> - atkbdintr() calls sckbdevent()
> - sckbdevent() unnecessarily acquires Giant again
> - the buggy unlocking drops Giant just once
> - shutdown_nice() is called with Giant held
> - the buggy unlocking fails to re-acquire Giant
> - sckbdevent() releases Giant, leaving it not held
> - sckbdevent() returns
> - atkbdintr() returns
> - intr_event_execute_handlers() releases Giant. This should panic, but
> it apparently blocks for long enough for init to shut down first.
>
> When I trace the last step, I get a panic which might be either from the
> different timing or just a bug in kdb.
Good point. I think the following change is good for everything except
calling shutdown_nice() from a fast interrupt handler with noinit running:
diff --git a/sys/kern/kern_shutdown.c b/sys/kern/kern_shutdown.c
index e5ea9644ad3f..564aecd811be 100644
--- a/sys/kern/kern_shutdown.c
+++ b/sys/kern/kern_shutdown.c
@@ -366,6 +366,12 @@ kern_reboot(int howto)
{
static int once = 0;
+ /*
+ * Drop Giant once and for all.
+ */
+ while (mtx_owned(&Giant))
+ mtx_unlock(&Giant);
+
#if defined(SMP)
/*
* Bind us to the first CPU so that all shutdown code runs there.
Some
Comments?
Warner
More information about the svn-src-head
mailing list