Re: BHYVE SNAPSHOT image format proposal

From: Tomek CEDRO <tomek_at_cedro.info>
Date: Tue, 23 May 2023 18:58:27 UTC
On Tue, May 23, 2023 at 6:06 PM Vitaliy Gusev wrote:
> Hi,
> Here is a proposal for bhyve snapshot/checkpoint image format improvements.
> It implies moving snapshot code to nvlist engine.

Hey there Vitaliy :-) bhyve is getting more and more traction. I am a
new bhyve user and no expert, but new and missing features are welcome,
I guess. There was a discussion on the mailing lists recently about a
better snapshot mechanism :-)


> Current snapshot implementation has disadvantages:
> 3 files per snapshot: .meta, .kern, vram

No problem, as long as the new single file is protected against
corruption (filesystem, transfer, application crash) and can be
modified in place easily and cheaply?
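
To make the corruption question concrete, here is a rough sketch of what
I mean by protecting a section. It is Python only for brevity, and it is
purely my assumption: the proposal does not mention checksums, and the
names seal/unseal are made up for illustration.

```python
import zlib

def seal(payload: bytes) -> bytes:
    """Append a 4-byte little-endian CRC32 to a section payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")

def unseal(blob: bytes) -> bytes:
    """Return the payload if its trailing CRC32 matches, else raise ValueError."""
    if len(blob) < 4:
        raise ValueError("section too short to carry a checksum")
    payload, stored = blob[:-4], int.from_bytes(blob[-4:], "little")
    if zlib.crc32(payload) != stored:
        raise ValueError("section checksum mismatch")
    return payload
```

With one checksum per section, a damaged section could be detected on
resume, and rewriting a section in place would only mean recomputing
that one checksum.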

> Binary Stream format of data.

The binary stream is small and fast, isn't it? Will the new format be
too?

> Adding optional variable - breaks resume
> Removing variable - breaks resume
> Changing saved order of variables - breaks resume

This obviously needs improvement :-)
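
Just to check that I understand the problem, a toy sketch (Python, with
made-up field names and layouts, not the real bhyve structures): a
fixed-order binary record stops decoding as soon as the reader expects
one more field, while a name/value record can fall back to a default.

```python
import struct

V1_LAYOUT = "<IQ"    # hypothetical: irq, base address
V2_LAYOUT = "<IQI"   # hypothetical: same, plus a new optional "flags" field

def load_positional(blob: bytes, layout: str) -> tuple:
    """Decode a fixed-order binary record; the layout must match exactly."""
    return struct.unpack(layout, blob)

def load_keyed(saved: dict) -> dict:
    """Decode a name/value record; a missing optional field gets a default."""
    return {
        "irq": saved["irq"],
        "base": saved["base"],
        "flags": saved.get("flags", 0),
    }

# State saved by the "old" code in both styles.
old_blob = struct.pack(V1_LAYOUT, 5, 0xC0000000)
old_keyed = {"irq": 5, "base": 0xC0000000}
```

Here load_positional(old_blob, V2_LAYOUT) raises struct.error because
the sizes differ, while load_keyed(old_keyed) still works. I assume the
nvlist approach aims for the second behaviour.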

> Hard to get information about what is saved and decode.
> Hard to debug if something goes wrong

Are additional tools missing? Will the new format allow inspection or
editing with a text editor?

> No versions. If change code, resume of an old images can be
> passed, but with UB.

Is the new format future-proof, and does it provide backward
compatibility?


> New nvlist implementation should solve all things above. The first step -
> improve snapshot/checkpoint saving format. It eliminates three files usage
> per a snapshot.
>
> 1. BHYVE SNAPSHOT image format:
>
> +---------------------------------------+
> |      HEADER PHYS - 4096 BYTES         |
> +---------------------------------------+
> |                                       |
> |                DATA                   |
> |                                       |
> +---------------------------------------+
> (..)
>     Predefined sections:  “config”, “devices”, “kernel”, “memory”.
> 4. EXAMPLE:
>  IDENT STRING:
>        "BHYVE CHECKPOINT IMAGE VERSION 1"
>  NVLIST HEADER:
>   [config]
>         config.offset = 0x1000 (4096)
>         config.size = 0x1f6 (502)
>         config.type = "text"
>   [kernel]
>         kernel.offset = 0x11f6 (4598)
>         kernel.size = 0x19a7 (6567)
>         kernel.type = "nvlist"
> (..)

So this will be a new text-config-based format with sections and
variable = value pairs?
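
If I read the example right, a reader would look up a section in the
header and then slice it out of the file by offset and size. A minimal
sketch of that idea follows (Python; a plain dict stands in for the
decoded nvlist header, the values are copied from the EXAMPLE above, and
placing the ident string at the start of the header is my assumption).

```python
HEADER_SIZE = 4096  # "HEADER PHYS - 4096 BYTES" in the proposal
IDENT = b"BHYVE CHECKPOINT IMAGE VERSION 1"

# Stand-in for the decoded nvlist header (values from the EXAMPLE).
EXAMPLE_SECTIONS = {
    "config": {"offset": 0x1000, "size": 0x1F6, "type": "text"},
    "kernel": {"offset": 0x11F6, "size": 0x19A7, "type": "nvlist"},
}

def read_section(image: bytes, sections: dict, name: str) -> bytes:
    """Return the raw bytes of one named section, with basic bounds checks."""
    entry = sections[name]
    start = entry["offset"]
    end = start + entry["size"]
    if start < HEADER_SIZE or end > len(image):
        raise ValueError(f"section {name!r} lies outside the image data")
    return image[start:end]

def build_demo_image(sections: dict) -> bytes:
    """Build a fake in-memory image: padded header, then dummy section data."""
    header = IDENT.ljust(HEADER_SIZE, b"\0")
    config = b"c" * sections["config"]["size"]
    kernel = b"k" * sections["kernel"]["size"]
    return header + config + kernel
```

Note that in the example config ends exactly where kernel starts
(0x1000 + 0x1f6 = 0x11f6), so the sections look tightly packed.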

How much will the overall file size increase?

How much longer will it take to decode/encode/process the files?

How likely is the format to change, and what about backward/forward
compatibility?
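
For example, I would imagine the version in the ident string being used
as a gate on resume, roughly like the sketch below (Python; the policy
"accept this version and older, refuse newer" and the names are only my
assumption, the proposal may handle it differently).

```python
import re

IDENT_RE = re.compile(r"BHYVE CHECKPOINT IMAGE VERSION (\d+)")
NEWEST_SUPPORTED = 1  # assumption: what this hypothetical reader understands

def image_version(ident: str) -> int:
    """Extract the version number from the ident string."""
    match = IDENT_RE.fullmatch(ident)
    if match is None:
        raise ValueError("not a bhyve checkpoint image")
    return int(match.group(1))

def can_resume(ident: str) -> bool:
    """Accept current and older versions, refuse images from newer code."""
    return image_version(ident) <= NEWEST_SUPPORTED
```

That would at least turn "resume passes but with UB" into a clean
refusal when the image is newer than the code reading it.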

Have you considered comparing the efficiency of the current format, the
proposed format, and maybe SQLite or JSON storage/parsers? For instance,
SQLite would be blazingly fast but hard to migrate, while JSON would be
the most versatile but more time- and memory-consuming?
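
A toy illustration of the size trade-off I have in mind (Python; the
register values are invented and this says nothing about nvlist itself,
a real comparison would need actual snapshot data):

```python
import json
import struct

# Made-up register snapshot, only to have something to encode.
registers = {
    "rax": 0xFFFFFFFF80000000,
    "rbx": 0x00007FFFFFFFE000,
    "rcx": 0x0000000000401000,
    "rdx": 0x0000000000000001,
}

def encode_binary(regs: dict) -> bytes:
    """Fixed-order packing: four unsigned 64-bit little-endian values."""
    return struct.pack("<4Q", regs["rax"], regs["rbx"], regs["rcx"], regs["rdx"])

def encode_json(regs: dict) -> bytes:
    """Self-describing text encoding: names travel with the values."""
    return json.dumps(regs).encode("utf-8")

binary_size = len(encode_binary(registers))
json_size = len(encode_json(registers))
```

The binary form is always 32 bytes here, while the JSON form is larger
because it carries the field names and decimal digits, but it can be
read back without knowing the field order.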

Or maybe the EFL approach to storing configuration files on embedded
systems with limited storage: the data is kept in binary form but can be
decompressed in chunks, and chunks can be replaced in place?
https://www.enlightenment.org/develop/efl/start

Sorry for asking all these questions, but there may already be good,
proven solutions out there, so there is no need to reinvent the wheel? :-)

--
CeDeROM, SQ7MHZ, http://www.tomek.cedro.info