[PATCH] Support for S5 (soft power off) in bhyve

John Baldwin jhb at freebsd.org
Mon Dec 23 22:08:56 UTC 2013


On Monday, December 23, 2013 5:01:45 pm Neel Natu wrote:
> Hi John,
> 
> This looks good - thanks for adding this support!
> 
> I noticed that the RESET_REGISTER is not being advertised in the FACP
> flags field. So, perhaps the access to 0xCF9 that you saw during
> testing was triggered by cpu_reset_real() as opposed to AcpiReset()?

Yeah, that might explain why I had to comment out the 0x64 check in
bhyverun.c directly as AcpiReset() should take precedence before
cpu_reset_real() is invoked.  I'll retest that tomorrow to make sure I have
the FACP correct.

> best
> Neel
> 
> On Mon, Dec 23, 2013 at 11:43 AM, John Baldwin <jhb at freebsd.org> wrote:
> > One minor nit I've found annoying is that there is no easy way to break out of
> > the loop in vmrun.sh.  Breaking into the loader prompt and typing 'reboot'
> > does work, but it requires you to be on the console at the time.  What I
> > really want is for 'shutdown -p' or 'poweroff' inside a guest to cleanly shut
> > down and exit the loop in vmrun.sh.  To that end, I've implemented support for
> > a few more registers (such of which are non-optional in ACPI) including the
> > Reset Control register (0xcf9), ACPI Power Management 1 Event registers
> > (PM1_EVT) and the ACPI Power Management 1 Control register (PM1_CNT).  I added
> > a valid _S5 package and catch writes of the value _S5 specifies to ask bhyve
> > to exit gracefully but with an exit code of 1 (so the loop in vmrun
> > terminates).  Most of the PM1 support is very simple (it doesn't support
> > signalling anything for external events as most of them don't make sense).  It
> > does perform a reset for writes to the 0xcf9 register however, and also
> > advertises 0xcf9 as the ACPI reset register.  (Tested by commenting out the
> > current 0x64 hack.)
> >
> > One thing I had to do was make it possible for an I/O port to request a reset
> > or poweroff, so I added some #define's for additional return codes from
> > in/out handlers.  I haven't gone through and changed any existing handlers to
> > use the new constants, but the new handlers make use of INOT_RESET and
> > INOUT_POWEROFF.  Note that this also means that the current 0x64 hack could be
> > reimplemented as an in-out handler rather than a special case.  I did not
> > explicitly catch VMEXIT_POWEROFF and use a custom exit value in the main loop
> > in bhyverun.c.  Instead, I let it fall through to the default and do an
> > exit(1).  The patch is all in bhyve itself:
> >
> > --- //depot/vendor/freebsd/src/usr.sbin/bhyve/Makefile
> > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/Makefile
> > @@ -10,7 +10,7 @@
> >  SRCS=  acpi.c atpic.c bhyverun.c block_if.c consport.c dbgport.c elcr.c
> >  SRCS+=  inout.c legacy_irq.c mem.c mevent.c mptbl.c pci_ahci.c
> >  SRCS+= pci_emul.c pci_hostbridge.c pci_lpc.c pci_passthru.c pci_virtio_block.c
> > -SRCS+= pci_virtio_net.c pci_uart.c pit_8254.c pmtmr.c post.c rtc.c
> > +SRCS+= pci_virtio_net.c pci_uart.c pit_8254.c pm.c pmtmr.c post.c rtc.c
> >  SRCS+= uart_emul.c virtio.c xmsr.c spinup_ap.c
> >
> >  .PATH: ${.CURDIR}/../../sys/amd64/vmm
> > --- //depot/vendor/freebsd/src/usr.sbin/bhyve/acpi.c
> > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/acpi.c
> > @@ -85,6 +85,8 @@
> >  #define BHYVE_ASL_SUFFIX       ".aml"
> >  #define BHYVE_ASL_COMPILER     "/usr/sbin/iasl"
> >
> > +#define BHYVE_PM1A_EVT_ADDR    0x400
> > +#define BHYVE_PM1A_CNT_ADDR    0x404
> >  #define BHYVE_PM_TIMER_ADDR    0x408
> >
> >  static int basl_keep_temps;
> > @@ -332,9 +344,11 @@
> >         EFPRINTF(fp, "[0001]\t\tACPI Disable Value : 00\n");
> >         EFPRINTF(fp, "[0001]\t\tS4BIOS Command : 00\n");
> >         EFPRINTF(fp, "[0001]\t\tP-State Control : 00\n");
> > -       EFPRINTF(fp, "[0004]\t\tPM1A Event Block Address : 00000000\n");
> > +       EFPRINTF(fp, "[0004]\t\tPM1A Event Block Address : %08X\n",
> > +                BHYVE_PM1A_EVT_ADDR);
> >         EFPRINTF(fp, "[0004]\t\tPM1B Event Block Address : 00000000\n");
> > -       EFPRINTF(fp, "[0004]\t\tPM1A Control Block Address : 00000000\n");
> > +       EFPRINTF(fp, "[0004]\t\tPM1A Control Block Address : %08X\n",
> > +                BHYVE_PM1A_CNT_ADDR);
> >         EFPRINTF(fp, "[0004]\t\tPM1B Control Block Address : 00000000\n");
> >         EFPRINTF(fp, "[0004]\t\tPM2 Control Block Address : 00000000\n");
> >         EFPRINTF(fp, "[0004]\t\tPM Timer Block Address : %08X\n",
> > @@ -397,10 +411,10 @@
> >         EFPRINTF(fp, "[0001]\t\tBit Width : 08\n");
> >         EFPRINTF(fp, "[0001]\t\tBit Offset : 00\n");
> >         EFPRINTF(fp, "[0001]\t\tEncoded Access Width : 01 [Byte Access:8]\n");
> > -       EFPRINTF(fp, "[0008]\t\tAddress : 0000000000000001\n");
> > +       EFPRINTF(fp, "[0008]\t\tAddress : 0000000000000CF9\n");
> >         EFPRINTF(fp, "\n");
> >
> > -       EFPRINTF(fp, "[0001]\t\tValue to cause reset : 00\n");
> > +       EFPRINTF(fp, "[0001]\t\tValue to cause reset : 06\n");
> >         EFPRINTF(fp, "[0003]\t\tReserved : 000000\n");
> >         EFPRINTF(fp, "[0008]\t\tFACS Address : 00000000%08X\n",
> >             basl_acpi_base + FACS_OFFSET);
> > @@ -412,7 +426,8 @@
> >         EFPRINTF(fp, "[0001]\t\tBit Width : 20\n");
> >         EFPRINTF(fp, "[0001]\t\tBit Offset : 00\n");
> >         EFPRINTF(fp, "[0001]\t\tEncoded Access Width : 02 [Word Access:16]\n");
> > -       EFPRINTF(fp, "[0008]\t\tAddress : 0000000000000001\n");
> > +       EFPRINTF(fp, "[0008]\t\tAddress : 00000000%08X\n",
> > +           BHYVE_PM1A_EVT_ADDR);
> >         EFPRINTF(fp, "\n");
> >
> >         EFPRINTF(fp,
> > @@ -431,7 +446,8 @@
> >         EFPRINTF(fp, "[0001]\t\tBit Width : 10\n");
> >         EFPRINTF(fp, "[0001]\t\tBit Offset : 00\n");
> >         EFPRINTF(fp, "[0001]\t\tEncoded Access Width : 02 [Word Access:16]\n");
> > -       EFPRINTF(fp, "[0008]\t\tAddress : 0000000000000001\n");
> > +       EFPRINTF(fp, "[0008]\t\tAddress : 00000000%08X\n",
> > +           BHYVE_PM1A_CNT_ADDR);
> >         EFPRINTF(fp, "\n");
> >
> >         EFPRINTF(fp,
> > @@ -603,6 +619,11 @@
> >         EFPRINTF(fp, "DefinitionBlock (\"bhyve_dsdt.aml\", \"DSDT\", 2,"
> >                  "\"BHYVE \", \"BVDSDT  \", 0x00000001)\n");
> >         EFPRINTF(fp, "{\n");
> > +       EFPRINTF(fp, "  Name (_S5, Package (0x02)\n");
> > +       EFPRINTF(fp, "  {\n");
> > +       EFPRINTF(fp, "      0x05,\n");
> > +       EFPRINTF(fp, "      Zero,\n");
> > +       EFPRINTF(fp, "  })\n");
> >         EFPRINTF(fp, "  Scope (_SB)\n");
> >         EFPRINTF(fp, "  {\n");
> >         EFPRINTF(fp, "    Device (PCI0)\n");
> > --- //depot/vendor/freebsd/src/usr.sbin/bhyve/bhyverun.c
> > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/bhyverun.c
> > @@ -72,6 +72,7 @@
> >  #define        VMEXIT_RESTART          2       /* restart current instruction */
> >  #define        VMEXIT_ABORT            3       /* abort the vm run loop */
> >  #define        VMEXIT_RESET            4       /* guest machine has reset */
> > +#define        VMEXIT_POWEROFF         5       /* guest machine has powered off */
> >
> >  #define MB             (1024UL * 1024)
> >  #define GB             (1024UL * MB)
> > @@ -296,12 +297,17 @@
> >                  return (vmexit_handle_notify(ctx, vme, pvcpu, eax));
> >
> >         error = emulate_inout(ctx, vcpu, in, port, bytes, &eax, strictio);
> > -       if (error == 0 && in)
> > +       if (error == INOUT_OK && in)
> >                 error = vm_set_register(ctx, vcpu, VM_REG_GUEST_RAX, eax);
> >
> > -       if (error == 0)
> > +       switch (error) {
> > +       case INOUT_OK:
> >                 return (VMEXIT_CONTINUE);
> > -       else {
> > +       case INOUT_RESET:
> > +               return (VMEXIT_RESET);
> > +       case INOUT_POWEROFF:
> > +               return (VMEXIT_POWEROFF);
> > +       default:
> >                 fprintf(stderr, "Unhandled %s%c 0x%04x\n",
> >                         in ? "in" : "out",
> >                         bytes == 1 ? 'b' : (bytes == 2 ? 'w' : 'l'), port);
> > --- //depot/vendor/freebsd/src/usr.sbin/bhyve/inout.h
> > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/inout.h
> > @@ -33,6 +33,12 @@
> >
> >  struct vmctx;
> >
> > +/* Handler return values. */
> > +#define        INOUT_ERROR     -1
> > +#define        INOUT_OK        0
> > +#define        INOUT_RESET     1
> > +#define        INOUT_POWEROFF  2
> > +
> >  typedef int (*inout_func_t)(struct vmctx *ctx, int vcpu, int in, int port,
> >                             int bytes, uint32_t *eax, void *arg);
> >
> > --- /dev/null
> > +++ //depot/user/jhb/bhyve/usr.sbin/bhyve/pm.c
> > @@ -0,0 +1,139 @@
> > +/*-
> > + * Copyright (c) 2013 Advanced Computing Technologies LLC
> > + * Written by: John H. Baldwin <jhb at FreeBSD.org>
> > + * All rights reserved.
> > + *
> > + * Redistribution and use in source and binary forms, with or without
> > + * modification, are permitted provided that the following conditions
> > + * are met:
> > + * 1. Redistributions of source code must retain the above copyright
> > + *    notice, this list of conditions and the following disclaimer.
> > + * 2. Redistributions in binary form must reproduce the above copyright
> > + *    notice, this list of conditions and the following disclaimer in the
> > + *    documentation and/or other materials provided with the distribution.
> > + *
> > + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
> > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> > + * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
> > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
> > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> > + * SUCH DAMAGE.
> > + */
> > +
> > +#include <sys/cdefs.h>
> > +__FBSDID("$FreeBSD");
> > +
> > +#include <sys/types.h>
> > +
> > +#include "inout.h"
> > +
> > +#define        PM1A_EVT_ADDR   0x400
> > +#define        PM1A_CNT_ADDR   0x404
> > +
> > +/*
> > + * Reset Control register at I/O port 0xcf9.  Bit 2 forces a system
> > + * reset when it transitions from 0 to 1.  Bit 1 selects the type of
> > + * reset to attempt: 0 selects a "soft" reset, and 1 selects a "hard"
> > + * reset.
> > + */
> > +static int
> > +reset_handler(struct vmctx *ctx, int vcpu, int in, int port, int bytes,
> > +    uint32_t *eax, void *arg)
> > +{
> > +       static uint8_t reset_control;
> > +
> > +       if (bytes != 1)
> > +               return (-1);
> > +       if (in)
> > +               *eax = reset_control;
> > +       else {
> > +               reset_control = *eax;
> > +
> > +               /* Treat hard and soft resets the same. */
> > +               if (reset_control & 0x4)
> > +                       return (INOUT_RESET);
> > +       }
> > +       return (0);
> > +}
> > +INOUT_PORT(reset_reg, 0xCF9, IOPORT_F_INOUT, reset_handler);
> > +
> > +/*
> > + * Power Management 1 Event Registers
> > + *
> > + * bhyve doesn't support any power management events currently, so the
> > + * status register always returns zero.  The enable register preserves
> > + * its value but has no effect.
> > + */
> > +static int
> > +pm1_status_handler(struct vmctx *ctx, int vcpu, int in, int port, int bytes,
> > +    uint32_t *eax, void *arg)
> > +{
> > +
> > +       if (bytes != 2)
> > +               return (-1);
> > +       if (in)
> > +               *eax = 0;
> > +       return (0);
> > +}
> > +
> > +static int
> > +pm1_enable_handler(struct vmctx *ctx, int vcpu, int in, int port, int bytes,
> > +    uint32_t *eax, void *arg)
> > +{
> > +       static uint16_t pm1_enable;
> > +
> > +       if (bytes != 2)
> > +               return (-1);
> > +       if (in)
> > +               *eax = pm1_enable;
> > +       else
> > +               pm1_enable = *eax;
> > +       return (0);
> > +}
> > +INOUT_PORT(pm1_status, PM1A_EVT_ADDR, IOPORT_F_INOUT, pm1_status_handler);
> > +INOUT_PORT(pm1_enable, PM1A_EVT_ADDR + 2, IOPORT_F_INOUT, pm1_enable_handler);
> > +
> > +/*
> > + * Power Management 1 Control Register
> > + *
> > + * This is mostly unimplemented except that we wish to handle writes that
> > + * set SPL_EN to handle S5 (soft power off).
> > + */
> > +#define        PM1_SLP_TYP     0x1c00
> > +#define        PM1_SLP_EN      0x2000
> > +#define        PM1_ALWAYS_ZERO 0xc003
> > +
> > +static int
> > +pm1_control_handler(struct vmctx *ctx, int vcpu, int in, int port, int bytes,
> > +    uint32_t *eax, void *arg)
> > +{
> > +       static uint16_t pm1_control;
> > +
> > +       if (bytes != 2)
> > +               return (-1);
> > +       if (in)
> > +               *eax = pm1_control;
> > +       else {
> > +               /*
> > +                * Various bits are write-only or reserved, so force them
> > +                * to zero in pm1_control.
> > +                */
> > +               pm1_control = *eax & ~(PM1_SLP_EN | PM1_ALWAYS_ZERO);
> > +
> > +               /*
> > +                * If SLP_EN is set, check for S5.  Bhyve's _S5_ method
> > +                * says that '5' should be stored in SLP_TYP for S5.
> > +                */
> > +               if (*eax & PM1_SLP_EN) {
> > +                       if ((pm1_control & PM1_SLP_TYP) >> 10 == 5)
> > +                               return (INOUT_POWEROFF);
> > +               }
> > +       }
> > +       return (0);
> > +}
> > +INOUT_PORT(pm1_control, PM1A_CNT_ADDR, IOPORT_F_INOUT, pm1_control_handler);
> > +
> >
> >
> > --
> > John Baldwin
> > _______________________________________________
> > freebsd-virtualization at freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
> > To unsubscribe, send any mail to "freebsd-virtualization-unsubscribe at freebsd.org"
> 

-- 
John Baldwin


More information about the freebsd-virtualization mailing list