bhyve win-guest benchmark comparison

Dustin Marquess dmarquess at gmail.com
Tue Oct 30 01:10:12 UTC 2018


It would be interesting to test running it under Xen with FreeBSD as the
dom0.
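
(For reference, and only as a rough sketch: a FreeBSD dom0 needs the
xen-kernel/xen-tools packages plus /boot/loader.conf entries along these
lines.  The memory/vCPU values are placeholders and the exact dom0 PVH
option depends on the Xen version:)

  # pkg install xen-kernel xen-tools

  # /boot/loader.conf additions:
  xen_kernel="/boot/xen"
  xen_cmdline="dom0_mem=8192M dom0_max_vcpus=4 dom0=pvh console=vga"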

-Dustin

On Sat, Oct 27, 2018 at 1:04 PM Harry Schmalzbauer <freebsd at omnilan.de>
wrote:

> On 22.10.2018 at 13:26, Harry Schmalzbauer wrote:
> > Test-Runs:
> > Each hypervisor had only the one bench-guest running; no other
> > tasks/guests were running besides the system's native standard processes.
> > Since the time between powering up the guest and finishing logon
> > differed notably (~5s vs. ~20s) from one host to the other, I did a
> > quick synthetic IO test beforehand.
> > I'm using IOmeter, since heise.de published a great test pattern called
> > IOmix – about 18 years ago, I guess.  This access pattern has always
> > perfectly reflected system performance for human computer usage with
> > non-calculation-centric applications, and it is still my favourite, even
> > though throughput and latency have changed by some orders of magnitude
> > during the last decade (I had also defined something for "fio" which
> > mimics IOmix and shows reasonable relational results, but I still
> > prefer IOmeter for homogeneous IO benchmarking).
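> >
> > (A rough sketch of such an fio job, purely for illustration – the
> > read/write mix, block-size split and target path below are placeholders,
> > not the real IOmix numbers:)
> >
> >   [global]
> >   ; raw test device – placeholder path
> >   filename=/dev/da0
> >   ioengine=posixaio
> >   direct=1
> >   time_based
> >   runtime=300
> >
> >   [iomix-like]
> >   ; mixed random read/write, mostly reads
> >   rw=randrw
> >   rwmixread=80
> >   ; mixed block sizes from 512B up to 64k
> >   bssplit=512/10:4k/35:16k/30:64k/25
> >   iodepth=4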
> >
> > The result is about a factor of 7 :-(
> > ~3800iops&69MB/s (CPU-guest-usage 42%IOmeter+12%irq)
> >                 vs.
> > ~29000iops&530MB/s (CPU-guest-usage 11%IOmeter+19%irq)
> >
> >
> >     [with a debug kernel and debug-malloc, the numbers are 3000iops&56MB/s;
> >      virtio-blk instead of ahci,hd: results in 5660iops&104MB/s with a
> >      non-debug kernel – much better, but with even higher CPU load and
> >      still a factor of 4 slower]
> >
> > What I don't understand is why the IOmeter process differs that much
> > in CPU utilization!?!  It's the same binary on the same OS (guest)
> > with the same OS driver and the same underlying hardware – "just" the
> > AHCI emulation and the vmm differ...
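> >
> > (For context, the guest was started with bhyve's ahci,hd: block backend
> > on the raw device; the virtio-blk variant only swaps that one slot.  The
> > slot numbers, CPU/RAM sizing and firmware path below are illustrative,
> > not my exact command line:)
> >
> >   bhyve -c 4 -m 8G -H -w \
> >     -s 0,hostbridge -s 31,lpc \
> >     -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
> >     -s 3,ahci,hd:/dev/da0 \
> >     -s 5,virtio-net,tap0 \
> >     testbench
> >
> >   # virtio-blk run: replace the disk slot with
> >   #   -s 3,virtio-blk,/dev/da0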
> >
> > Unfortunately, the picture for virtio-net vs. vmxnet3 is similarly sad.
> > Copying a single 5GB file from a CIFS share to the DB-ssd results in 100%
> > guest-CPU usage, where 40% are irqs, and the throughput maxes out at
> > ~40MB/s.
> > When copying the same file from the same source with the same guest on
> > the same host, but with the host booted into ESXi, there's 20% guest-CPU
> > usage while transferring 111MB/s – the GbE uplink limit.
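> >
> > (On the bhyve side, the virtio-net backend was a plain tap(4) interface
> > bridged to the GbE uplink, roughly as sketched below; the interface
> > names are placeholders:)
> >
> >   ifconfig tap0 create
> >   ifconfig bridge0 create
> >   ifconfig bridge0 addm igb0 addm tap0 up
> >   ifconfig tap0 up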
> >
> > These synthetic benchmarks explain very well the perceptible difference
> > when using a guest on the two hypervisors, but
> >
> To add an additional and, at least for me, rather surprising result:
>
> VirtualBox provides:
> 'VBoxManage internalcommands createrawvmdk -filename
> "testbench_da0.vmdk" -rawdisk /dev/da0'
>
> So I could use exactly the same test setup as for ESXi and bhyve.
> FreeBSD VirtualBox (running on the same host installation as bhyve)
> performed quite well, although it doesn't survive an IOmix benchmark run
> when "testbench_da0.vmdk" (the "raw" SSD-R0 array) is hooked up to
> the emulated SATA controller.
> But connected to the emulated SAS controller (LSI1068), it runs without
> problems and results in 9600iops&185MB/s with 1% IOmeter + 7% irq CPU
> utilization (yes, 1% vs. 42% for the IOmeter load).
> Still far away from what ESXi provides, but almost double the performance
> of virtio-blk with bhyve, and, most importantly, with much less load (host
> and guest show exactly the same low values, as opposed to the very high
> loads shown on host and guest with bhyve:virtio-blk).
> The HD Tune random access benchmark also shows the factor of 2, linearly
> over all block sizes.
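>
> (For completeness: attaching the raw vmdk to that emulated SAS controller
> can be done roughly as below – the VM name and port are placeholders:)
>
>   VBoxManage storagectl "testbench" --name "SAS" --add sas \
>     --controller LSILogicSAS
>   VBoxManage storageattach "testbench" --storagectl "SAS" \
>     --port 0 --device 0 --type hdd --medium testbench_da0.vmdk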
>
> VirtualBox's virtio-net setup gives ~100MB/s, with peaks at 111MB/s, and
> ~40% CPU load.
> The guest uses the same driver as with bhyve:virtio-net, while the backend
> of virtualbox:virtio-net is vboxnetflt utilizing netgraph and
> vboxnetadp.ko, vs. tap(4) on bhyve.
> So not only is the IO efficiency remarkably better (lower throughput, but
> also much lower CPU utilization), but so is the network performance.
> Even low-bandwidth RDP sessions via GbE LAN suffer from micro-hangs
> under bhyve with virtio-net.  And 40MB/s transfers cause 100% CPU load on
> bhyve – both runs had exactly the same Windows virtio-net driver in use
> (RedHat 141).
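>
> (The VirtualBox NIC for those runs was a bridged virtio adapter, roughly
> configured as below – the VM name, NIC index and uplink interface are
> placeholders:)
>
>   VBoxManage modifyvm "testbench" --nic1 bridged \
>     --bridgeadapter1 igb0 --nictype1 virtio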
>
> Conclusion: VirtualBox vs. ESXi shows an efficiency factor of roughly 0.5,
> while bhyve vs. ESXi shows an overall efficiency factor of roughly 0.25.
> I tried to provide a test environment with the shortest hardware paths
> possible.  At least the benchmarks were 100% reproducible with the
> same binaries.
>
> So I'm still really interested in what I asked before:
> > Are these performance issues (emulation(-only?) related, I guess) well
> > known?  I mean, does somebody know what needs to be done in what area
> > in order to catch up with the other results, so that it's just a matter
> > of time/resources?
> > Or are these results surprising, and does extensive analysis have to be
> > done before anybody can tell how to fix the IO limitations?
> >
> > Is the root cause of the problematically low virtio-net throughput
> > perhaps the same as for the disk IO limits?  Both really hurt in my
> > use case, and the host is not idling in proportion, but even shows
> > higher load with lower results.  So even if the lower
> > user-experience performance were considered tolerable, the achievable
> > guest-per-host density would only be half.
>
> Thanks,
>
> -harry
>

