suspend/resume on BHyVe

Takuya ASADA syuu at
Wed Mar 27 09:59:34 UTC 2013


I have been discussing this project with Iori since last year, and I'm now
suggesting that he apply to Google Summer of Code '13 with it.
(GSoC '13 will start next month.)

> > For this, I think those below must be implemented.
> >
> >   - virtual machine state command interface
> >   - saving registers per CPU
> >   - dumping physical memory
> >   - saving virt-io and other device emulation state
> >
> >
> > To save registers, the sysctl used in bhyvectl (vmmctl command
> > previously) is helpful,

> Maybe he meant ioctl.

> > however, its interface is not good, because getting each register value
> > causes a sysctl call,
> > and therefore a context switch per register; for getting all registers,
> > it's not efficient.

> > I think it would be preferable to define a struct with a boolean
> > field per register,
> > to tell the kernel which registers it should return, and have the kernel
> > return that state in a struct vmxctx.
> >
> > struct vmxctx looks like this:
> >  struct vmxctx {
> >         register_t      tmpstk[32];             /* vmx_return() stack */
> >         register_t      tmpstktop;
> >
> >         register_t      guest_rdi;              /* Guest state */
> >         register_t      guest_rsi;
> >                 :
> >                 :

It looks like we don't really have to take care of the register values in the
VMCS here; just the registers in vmxctx are enough (described below).
Then, how about adding a vmxctx dumping ioctl?
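
To illustrate the batching idea discussed above (one call returning many
registers instead of one call per register), here is a minimal userland
sketch. Everything in it is hypothetical: the names sketch_reg, sketch_vmxctx,
sketch_regreq and sketch_fetch_regs are invented for illustration and are not
part of vmm.ko, and the struct only stands in for a few guest fields of
struct vmxctx. It uses a bit mask rather than separate boolean fields, which
is one possible encoding of the same request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register indices; NOT the real vmm.ko numbering. */
enum sketch_reg { SK_RDI, SK_RSI, SK_RDX, SK_RCX, SK_REG_COUNT };

/* Stand-in for a few guest fields of struct vmxctx. */
struct sketch_vmxctx {
	uint64_t guest_rdi;
	uint64_t guest_rsi;
	uint64_t guest_rdx;
	uint64_t guest_rcx;
};

/* Hypothetical request: one mask bit per wanted register. */
struct sketch_regreq {
	uint32_t mask;               /* in: bit i set => return register i */
	uint64_t val[SK_REG_COUNT];  /* out: values of requested registers */
};

/*
 * What a kernel-side handler might do: copy every requested register
 * in a single call. Returns the number of registers copied.
 */
int
sketch_fetch_regs(const struct sketch_vmxctx *ctx, struct sketch_regreq *req)
{
	assert(ctx != NULL && req != NULL);

	const uint64_t src[SK_REG_COUNT] = {
		[SK_RDI] = ctx->guest_rdi,
		[SK_RSI] = ctx->guest_rsi,
		[SK_RDX] = ctx->guest_rdx,
		[SK_RCX] = ctx->guest_rcx,
	};
	int copied = 0;

	for (int i = 0; i < SK_REG_COUNT; i++) {
		if (req->mask & (1u << i)) {
			req->val[i] = src[i];
			copied++;
		}
	}
	return copied;
}
```

A caller would set the bits it cares about in mask, issue the request once,
and read the results from val[]; entries whose bit was not set are left
untouched.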

> >
> >
> > And, considering memory dump, /dev/vmm/vmname is a file that is a map
> > of guest memory,
> > so memory dump doesn't seem hard, just stop vm, write back all guest
> > cache, and copy
> > memory file to a regular file.
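
The copy step described above can be sketched as follows. This is only an
illustration under assumptions: it presumes the VM is already stopped, that
the guest memory has already been made addressable in the process (for
example by mapping the /dev/vmm/vmname file), and that out_fd is an open
regular file. The function name dump_guest_memory is invented here. The
cache write-back step is not covered.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes starting at mem to out_fd, retrying on short writes
 * and EINTR. Returns 0 on success, -1 on error (errno set by write()).
 */
int
dump_guest_memory(int out_fd, const void *mem, size_t len)
{
	const unsigned char *p = mem;

	assert(mem != NULL || len == 0);
	while (len > 0) {
		ssize_t n = write(out_fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		if (n == 0)
			return -1;		/* no progress, give up */
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```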
> >
> > Finally, I don't know much about device state, but I think there must
> > be some state to be saved, like
> > the network stack.
> >
> > I'm not sure what I wrote above is correct, so I would appreciate your ideas and suggestions.
> >
> I think that you are on the right track.
> A brute force way of figuring out all the state must be saved is to
> look at all the initialization functions that are called when a vm and
> a vcpu are created. So, this would be vm_create() and vcpu_init() in
> the kernel module.
> There is also the hardware assist state that is maintained by the
> processor (VT-x or SVM) and this includes things like guest
> interruptibility, guest run state etc. I am assuming that it would be
> sufficient to save the VMCS page after telling the processor to flush
> any state it may be caching on chip.

I think just dumping the whole VMCS page after executing the VMCLEAR
instruction is the easiest way to do this.
(I also considered dumping only the necessary VMCS fields with the VMREAD
instruction, but that way it is easy to break guest state by mistake, and we
gain no advantage from it.)

Then maybe we need a VMCS dumping ioctl here.

> There is also emulated pci bus, virtio devices and legacy isa device
> state that would need to be saved by the userspace 'bhyve' process.

What operations are necessary to suspend/resume virtio devices?
Maybe dumping all of the devices' rings?
They don't have registers, right?

> And finally there is the matter of how to communicate with 'bhyve'
> process that it needs to suspend the virtual machine and write its
> state to disk - perhaps a signal would be good enough place to start.

How about this idea:
bhyvectl sends a VM_SUSPEND ioctl.
If the guest is in VMX non-root mode, the VM_SUSPEND ioctl handler sends an IPI
to interrupt the guest thread.
Then the guest thread breaks out of the vmx_run() loop and exits to userland with exitcode

Or, if the guest is not in the VM_RUN ioctl but is performing userland work (such
as running a virtio host-side driver), maybe you just need to wait until the bhyve
process sends the VM_RUN ioctl.
When bhyve sends the VM_RUN ioctl, vmm.ko should not perform a VM entry.
It should just return VM_EXITCODE_SUSPEND.

In both cases, vmm.ko returns VM_EXITCODE_SUSPEND at the end.
Then the bhyve process can perform the suspend action in VM_EXITCODE_SUSPEND

I think this is simple.
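
A single-threaded toy model of the two cases above, as a sketch only. All
names (sketch_vm, sketch_vm_suspend, sketch_vm_run, the SK_EXIT_* codes) are
invented for illustration and do not exist in vmm.ko; the IPI is merely
counted, not sent, and "guest execution" is just a loop counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical exit codes standing in for the VM_EXITCODE_* values. */
enum sketch_exitcode { SK_EXIT_OTHER, SK_EXIT_SUSPEND };

struct sketch_vm {
	bool suspend_requested;	/* set by the suspend request */
	bool in_guest;		/* true while "in VMX non-root mode" */
	int  ipis_sent;		/* how many times we would have sent an IPI */
};

/* Models the VM_SUSPEND handler: flag the request, kick the vcpu if needed. */
void
sketch_vm_suspend(struct sketch_vm *vm)
{
	assert(vm != NULL);
	vm->suspend_requested = true;
	if (vm->in_guest) {
		vm->ipis_sent++;	/* a real handler would send an IPI here */
		vm->in_guest = false;	/* the IPI forces a VM exit */
	}
}

/*
 * Models the VM_RUN handler: check the flag before every VM entry, so a
 * pending suspend is reported without entering the guest again.
 */
enum sketch_exitcode
sketch_vm_run(struct sketch_vm *vm, int guest_steps)
{
	assert(vm != NULL);
	for (;;) {
		if (vm->suspend_requested)
			return SK_EXIT_SUSPEND;
		if (guest_steps-- <= 0)
			return SK_EXIT_OTHER;	/* some unrelated exit */
		vm->in_guest = true;		/* VM entry */
		vm->in_guest = false;		/* VM exit */
	}
}
```

In this model the userland side would treat SK_EXIT_SUSPEND as the point at
which it saves state, regardless of which of the two paths produced it.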

> This certainly sounds like an interesting and challenging project and
> we would be happy to help in any way we can.
> best
> Neel
> > Thanks, Iori.
> > _______________________________________________
> > freebsd-virtualization at mailing list
> >
> > To unsubscribe, send any mail to "
> freebsd-virtualization-unsubscribe at"
