Adding a different type of blockstore to Bhyve
Willem Jan Withagen
wjw at digiware.nl
Tue Jan 7 09:53:27 UTC 2020
On 30-12-2019 19:06, Willem Jan Withagen wrote:
> Hi,
>
> One of the ways to run a backing blockstore with KVM/Qemu is through
> the Ceph Rados Block Device (RBD).
> https://github.com/qemu/qemu/blame/master/block/rbd.c
>
> This makes it possible to use an RBD image as a boot image or other
> block device, and the virtual machine using this image can migrate to
> another Dom0 host.
>
> I've been working on Ceph for quite some time, and one of the ways to
> offer a block device on FreeBSD is with rbd-ggate.
> This works thru geom-gate and will give a /dev/ggate# device that is
> mapped to an image in a rados pool.
>
> And I am not into migration for Bhyve, but I would like to integrate RBD
> into Bhyve as an alternative backing store....
>
> Something like:
> bhyve -s 1,virtio-blk,rbd:poolname/imagename[@snapshotname] \
> [:option1=value1[:option2=value2...]]
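To make the proposed syntax concrete, here is a minimal sketch of how such
a spec string could be split into its parts. Everything in it (struct
rbd_spec, rbd_spec_parse(), the size limits) is hypothetical and made up
for illustration; it is not code from bhyve, Ceph or the review, and it
only stores the options as raw key=value strings without interpreting them.

```c
#include <stddef.h>
#include <string.h>

#define RBD_NAME_MAX 64	/* assumed limit, for illustration only */
#define RBD_OPTS_MAX 8	/* assumed limit, for illustration only */

/* Hypothetical result of parsing an "rbd:" backing-store spec. */
struct rbd_spec {
	char pool[RBD_NAME_MAX];
	char image[RBD_NAME_MAX];
	char snap[RBD_NAME_MAX];	/* empty when no @snapshotname is given */
	char opts[RBD_OPTS_MAX][RBD_NAME_MAX];	/* raw "key=value" strings */
	int nopts;
};

/* Copy len bytes of src into a RBD_NAME_MAX buffer; -1 if it does not fit. */
static int
copy_field(char *dst, const char *src, size_t len)
{
	if (len >= RBD_NAME_MAX)
		return (-1);
	memcpy(dst, src, len);
	dst[len] = '\0';
	return (0);
}

/*
 * Parse "rbd:poolname/imagename[@snapshotname][:key=value...]".
 * Returns 0 on success, -1 on malformed input.
 */
int
rbd_spec_parse(const char *str, struct rbd_spec *spec)
{
	const char *p, *slash, *at, *end;
	size_t len;

	memset(spec, 0, sizeof(*spec));
	if (strncmp(str, "rbd:", 4) != 0)
		return (-1);
	p = str + 4;

	/* The image part ends at the first ':' or at the end of the string. */
	end = strchr(p, ':');
	if (end == NULL)
		end = p + strlen(p);

	slash = memchr(p, '/', (size_t)(end - p));
	if (slash == NULL || slash == p)
		return (-1);
	if (copy_field(spec->pool, p, (size_t)(slash - p)) != 0)
		return (-1);

	p = slash + 1;
	at = memchr(p, '@', (size_t)(end - p));
	len = (size_t)((at != NULL ? at : end) - p);
	if (len == 0 || copy_field(spec->image, p, len) != 0)
		return (-1);
	if (at != NULL) {
		len = (size_t)(end - (at + 1));
		if (len == 0 || copy_field(spec->snap, at + 1, len) != 0)
			return (-1);
	}

	/* Each remaining ':'-separated piece must look like key=value. */
	p = end;
	while (*p == ':') {
		p++;
		end = strchr(p, ':');
		if (end == NULL)
			end = p + strlen(p);
		len = (size_t)(end - p);
		if (len == 0 || memchr(p, '=', len) == NULL ||
		    spec->nopts == RBD_OPTS_MAX)
			return (-1);
		if (copy_field(spec->opts[spec->nopts], p, len) != 0)
			return (-1);
		spec->nopts++;
		p = end;
	}
	return (0);
}
```

With this sketch, a string such as rbd:poolname/imagename@snapshotname
would fill in pool, image and snap, and anything that does not start with
"rbd:" is rejected so that another backend can claim it.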
>
> So I started browsing the bhyve code, and ended up in block_if.{hc}.
> But code there is rather strongly targeted towards a local
> filesystem storage....
>
> I also ran into net_backends.{ch}, and I guess it would be a nicer
> solution to create a block_backends.{ch} as well for interfacing to
> more than just one blockstore provider.
> And then load the RBD provider in the chain of blockstore providers.
> That way it would even be possible to make that code dl-loadable in
> case the LGPL ceph code is not directly importable in the usr.sbin
> tree. (Which I suspect it is)
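As a rough illustration of the provider chain described above, the sketch
below shows one way a table of blockstore providers could be searched by
spec prefix. All names here (struct block_backend, blockbe_find(), the
stub open functions) are assumptions made for this example; they do not
reflect the actual net_backends code or any existing bhyve interface.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical per-provider operations table. */
struct block_backend {
	const char *name;	/* provider name, for diagnostics */
	const char *scheme;	/* spec prefix; "" matches any spec */
	int (*open)(const char *spec, void **statep);
	ssize_t (*read)(void *state, void *buf, size_t len, off_t off);
	ssize_t (*write)(void *state, const void *buf, size_t len, off_t off);
	void (*close)(void *state);
};

/* Stub open routines; a real provider would open a file or an RBD image. */
static int
local_open(const char *spec, void **statep)
{
	(void)spec;
	*statep = NULL;
	return (0);
}

static int
rbd_open(const char *spec, void **statep)
{
	(void)spec;
	*statep = NULL;
	return (0);
}

/* The read/write/close members are left NULL in this sketch. */
static const struct block_backend rbd_backend = {
	.name = "rbd", .scheme = "rbd:", .open = rbd_open,
};

static const struct block_backend local_backend = {
	.name = "local", .scheme = "", .open = local_open,
};

/* Providers are tried in order; the catch-all local provider goes last. */
static const struct block_backend *backends[] = {
	&rbd_backend,
	&local_backend,
};

/* Return the first provider whose scheme prefixes spec, or NULL. */
const struct block_backend *
blockbe_find(const char *spec)
{
	size_t i;

	for (i = 0; i < sizeof(backends) / sizeof(backends[0]); i++) {
		const char *scheme = backends[i]->scheme;

		if (strncmp(spec, scheme, strlen(scheme)) == 0)
			return (backends[i]);
	}
	return (NULL);
}
```

The extra cost per request in a design like this is one indirect call
through the function pointer table. A dl-loadable provider could in
principle add itself to such a table at load time, but that part is not
shown here.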
>
> The alternative is to start using the /dev/ggate# devices but then
> we probably lose the option of live migration.
> And performance takes a serious hit:
> A block write/read would go from the vm kernel
> to the bhyve process in userspace.
> Then it would go to /dev/ggate# and again end up in the kernel
> only to have geom-gate send it back to userspace
> where rbd-ggate sends it to the cluster.
>
> Just typing this data flow is a lot of steps, showing that this
> might not be the best architecture.
>
> So the questions are:
> 1) Is the abstraction of block_backends.{ch} the way to go?
> 1.1) And would the extra indirection there be acceptable?
> (For network devices it seems no problem)
>
> 2) Does anybody already have such a framework for blockdevs?
> (Otherwise I'll try to morph the net_backends.{ch})
>
> 3) Other suggestions I need to consider?
Looking for reviewers of:
https://reviews.freebsd.org/D23010
In the days after New Year I made a first attempt to refactor the
block_if stuff into a generic backend: blockbe_ and an implementation
of the local storage: lockblk_
I've submitted it to phabricator, and I'm seeking reviews and ultimately
somebody that will commit this when all issues are worked out.
--WjW
More information about the freebsd-virtualization
mailing list