Re: git: e4ab361e5394 - main - fix poweroff regression from 9cdf326b4f by delaying shutdown_halt

From: Warner Losh <imp_at_bsdimp.com>
Date: Tue, 06 Feb 2024 22:14:24 UTC
On Tue, Feb 6, 2024 at 3:13 AM Andriy Gapon <avg@freebsd.org> wrote:

> On 06/02/2024 11:41, Andriy Gapon wrote:
> > The branch main has been updated by avg:
> >
> > URL: https://cgit.FreeBSD.org/src/commit/?id=e4ab361e53945a6c3e9d68c5e5ffc11de40a35f2
> >
> > commit e4ab361e53945a6c3e9d68c5e5ffc11de40a35f2
> > Author:     Andriy Gapon <avg@FreeBSD.org>
> > AuthorDate: 2024-02-06 08:55:13 +0000
> > Commit:     Andriy Gapon <avg@FreeBSD.org>
> > CommitDate: 2024-02-06 08:55:13 +0000
> >
> >      fix poweroff regression from 9cdf326b4f by delaying shutdown_halt
> >
> >      The regression affected ACPI-based systems without EFI poweroff support
> >      (including VMs).
> >
> >      The key reason for the regression is that I overlooked that poweroff is
> >      requested by RB_POWEROFF | RB_HALT combination of flags.  In my opinion,
> >      that command is a bit bipolar, but since we've been doing that forever,
> >      then so be it.  Because of that flag combination, the order of
> >      shutdown_final handlers that check for either flag does matter.
> >
> >      Some additional complexity comes from platform-specific shutdown_final
> >      handlers that aim to handle multiple reboot options at once.  E.g.,
> >      acpi_shutdown_final handles both poweroff and reboot / reset.  As
> >      explained in 9cdf326b4f, such a handler must run after shutdown_panic to
> >      give it a chance.  But as the change revealed, the handler must also run
> >      before shutdown_halt, so that the system can actually power off before
> >      entering the halt limbo.
> >
> >      Previously, shutdown_panic and shutdown_halt had the same priority which
> >      appears to be incompatible with handlers that can do both poweroff and
> >      reset.
>
> I want to add that having many handlers with priorities expressed like
> SHUTDOWN_PRI_LAST ± N while some of those handlers have implicit
> inter-dependencies (interactions, interference) also does not help to see a
> clear picture.
>
> Perhaps it would be better to handle all (reasonable) RB flag combinations
> centrally in kern_reboot and then dispatch events like shutdown_reset,
> shutdown_poweroff, etc.  Handlers for those events would have a single and
> simple job of performing that one action (perhaps failing and letting another
> handler try).
>
> Also, I would split reboot howto into command and flag portions, so that only
> one command can be specified at a time.  E.g., I would consider RB_AUTOBOOT
> ("RB_REBOOT"), RB_POWEROFF, RB_HALT to be distinct commands.  Then, flags like
> RB_NOSYNC or RB_DUMP could be optional flags.
>

Part of the problem is that RB_AUTOBOOT's value is 0, and we're using bits to
describe what to do (that was the fashion in the late 80s/90s; bio used to have
its commands as bits, not a bit field). Your list also doesn't include
RB_POWERCYCLE, which is a new bit.

It's a mess.
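
To make the zero-value problem concrete, here is a small userland sketch.
The EX_ names and bit values are stand-ins chosen for the example and are
not copied from sys/reboot.h; the only thing taken from the discussion is
that the "autoboot" command is 0, so it cannot be tested like the others.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in bit values for illustration; not copied from sys/reboot.h. */
#define EX_RB_AUTOBOOT   0x0	/* "reboot" is the absence of action bits */
#define EX_RB_HALT       0x1
#define EX_RB_POWEROFF   0x2
#define EX_RB_POWERCYCLE 0x4
#define EX_RB_NOSYNC     0x8	/* a modifier flag, not an action */

#define EX_RB_ACTION_MASK \
	(EX_RB_HALT | EX_RB_POWEROFF | EX_RB_POWERCYCLE)

/* Broken: masking with 0 always yields 0, so this never reports a reboot. */
static bool
ex_is_reboot_naive(int howto)
{
	return ((howto & EX_RB_AUTOBOOT) != 0);
}

/* Works: a reboot is a howto with none of the action bits set. */
static bool
ex_is_reboot(int howto)
{
	return ((howto & EX_RB_ACTION_MASK) == 0);
}

/* Usage example: exercises both predicates. */
static void
ex_is_reboot_selfcheck(void)
{
	assert(!ex_is_reboot_naive(EX_RB_AUTOBOOT));
	assert(ex_is_reboot(EX_RB_AUTOBOOT));
	assert(ex_is_reboot(EX_RB_NOSYNC));
	assert(!ex_is_reboot(EX_RB_POWEROFF | EX_RB_HALT));
}
```

The point is only that "reboot" has to be inferred from what is absent,
which is why it does not fit a one-bit-per-command scheme.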

> As an aside, some flags documented for reboot(2) do not seem to have much to
> do with reboot.  E.g., RB_DFLTROOT affects how a system boots up, but not how
> the system goes for a reboot.  Not surprisingly, that option is not handled by
> anything kicked off with reboot(2).
> Maybe, it would make more sense if we had fast reboot support and the running
> kernel could instruct the next kernel directly.  But, it's still a bit weird
> that flags like RB_POWEROFF and RB_DFLTROOT belong in the same domain and can
> be set together.
>

More like 'support again' since this interface is from 4BSD and hasn't been
updated in a very long time. It made sense when you could tell the VAX's
firmware details about the next reboot, but we don't really have that short of
implementing kexec...

Though to fix it we should maybe just have a number of handlers that are
called at each stage, and we deal with only one bit at a time (POWERCYCLE >
POWEROFF > HALT) and your drivers register a separate one for each...  It
would be a bit more rework in the tree, and there'd be a few more functions
called, but it would be a minimal change.
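
Roughly, the "one bit at a time" ordering could look like the sketch below.
It is userland C with made-up names: ex_pick_action, the EX_ACT_ enum and the
EX_RB_ bit values are all hypothetical stand-ins, not anything that exists in
kern_shutdown.c or sys/reboot.h.

```c
#include <assert.h>

/* Stand-in bit values for illustration; not copied from sys/reboot.h. */
#define EX_RB_HALT       0x1
#define EX_RB_POWEROFF   0x2
#define EX_RB_POWERCYCLE 0x4

/* Hypothetical single action derived from a howto bitmask. */
enum ex_action {
	EX_ACT_REBOOT,
	EX_ACT_HALT,
	EX_ACT_POWEROFF,
	EX_ACT_POWERCYCLE
};

/*
 * Reduce a bitmask to exactly one action, checking the bits in the
 * order POWERCYCLE > POWEROFF > HALT.  No action bit means reboot.
 */
static enum ex_action
ex_pick_action(int howto)
{
	if ((howto & EX_RB_POWERCYCLE) != 0)
		return (EX_ACT_POWERCYCLE);
	if ((howto & EX_RB_POWEROFF) != 0)
		return (EX_ACT_POWEROFF);
	if ((howto & EX_RB_HALT) != 0)
		return (EX_ACT_HALT);
	return (EX_ACT_REBOOT);
}

/* Usage example: the poweroff request from the commit message. */
static void
ex_pick_action_selfcheck(void)
{
	/* RB_POWEROFF | RB_HALT resolves to poweroff, not halt. */
	assert(ex_pick_action(EX_RB_POWEROFF | EX_RB_HALT) == EX_ACT_POWEROFF);
	assert(ex_pick_action(EX_RB_HALT) == EX_ACT_HALT);
	assert(ex_pick_action(0) == EX_ACT_REBOOT);
}
```

With the decision made in one place, the RB_POWEROFF | RB_HALT combination
stops depending on the relative order of the handlers.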

But it kinda feels like we should just bite the bullet and have 3 handlers for
these cases: one to power cycle, one to power off and one to halt. Then the
drivers wouldn't care which ones have priority, they'd just check a bit and do
what they are told (or maybe we say that they only run when the bit is set, to
make that code simpler). And if one bit of hardware can do all three, it would
have to implement 3 handlers... The ambiguity would be gone and the ordering
wouldn't matter.
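
A very rough sketch of the three-handler shape, again in userland C. None of
these names exist in the tree: ex_register, ex_dispatch and the EX_EV_ events
are invented for the example, and a fixed-size table stands in for whatever
registration mechanism would really be used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical events, one per action; not an existing kernel API. */
enum ex_event { EX_EV_POWERCYCLE, EX_EV_POWEROFF, EX_EV_HALT, EX_EV_COUNT };

/*
 * A handler returns false if it cannot perform the action.  In a kernel a
 * successful handler would not return at all; here it returns true so the
 * sketch can be exercised in userland.
 */
typedef bool (*ex_handler_t)(void *arg);

#define EX_MAX_HANDLERS 4

static struct {
	ex_handler_t fn;
	void *arg;
} ex_handlers[EX_EV_COUNT][EX_MAX_HANDLERS];
static size_t ex_nhandlers[EX_EV_COUNT];

/* Register a handler for exactly one event; false if the table is full. */
static bool
ex_register(enum ex_event ev, ex_handler_t fn, void *arg)
{
	if (ex_nhandlers[ev] == EX_MAX_HANDLERS)
		return (false);
	ex_handlers[ev][ex_nhandlers[ev]].fn = fn;
	ex_handlers[ev][ex_nhandlers[ev]].arg = arg;
	ex_nhandlers[ev]++;
	return (true);
}

/* Run only the handlers for one event, stopping at the first success. */
static bool
ex_dispatch(enum ex_event ev)
{
	for (size_t i = 0; i < ex_nhandlers[ev]; i++) {
		if (ex_handlers[ev][i].fn(ex_handlers[ev][i].arg))
			return (true);
	}
	return (false);
}

/* Example handlers: one that gives up, one that records that it ran. */
static bool
ex_unsupported(void *arg)
{
	(void)arg;
	return (false);
}

static bool
ex_mark_done(void *arg)
{
	*(int *)arg = 1;
	return (true);
}

/* Usage example: two poweroff handlers, none for halt. */
static void
ex_dispatch_selfcheck(void)
{
	static int done;	/* static: the table keeps a pointer to it */
	bool ok;

	done = 0;
	ok = ex_register(EX_EV_POWEROFF, ex_unsupported, NULL);
	assert(ok);
	ok = ex_register(EX_EV_POWEROFF, ex_mark_done, &done);
	assert(ok);

	assert(!ex_dispatch(EX_EV_HALT));	/* poweroff handlers not run */
	assert(done == 0);
	assert(ex_dispatch(EX_EV_POWEROFF));	/* falls through to 2nd one */
	assert(done == 1);
}
```

A driver that can do all three would call ex_register three times, once per
event, and none of its handlers would need to look at the other bits.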

Warner