bhyve: corrupting zfs pools?

Andriy Gapon avg at FreeBSD.org
Tue Jun 2 16:46:08 UTC 2015


On 02/06/2015 14:14, Andriy Gapon wrote:
> 
> I am doing a simple experiment.
> 
> I get FreeBSD image from here:
> ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/11.0/FreeBSD-11.0-CURRENT-amd64-r283577-20150526-memstick.img.xz
> 
> Then I run it in bhyve with two additional "disks" created with truncate -s 4g:
> $ bhyveload -m 1G -d ~/tmp/FreeBSD-11.0-CURRENT-amd64-r283577-20150526-memstick.img test
> $ bhyve -A -HP -s 0:0,hostbridge -s 1,lpc -s 2:0,virtio-net,tap0 \
>     -s 3:0,virtio-blk,/home/avg/tmp/FreeBSD-11.0-CURRENT-amd64-r283577-20150526-memstick.img \
>     -s 3:1,virtio-blk,/tmp/l2arc-test/hdd1,sectorsize=512/4096 \
>     -s 3:2,virtio-blk,/tmp/l2arc-test/hdd2,sectorsize=512/4096 \
>     -l com1,stdio -l com2,/dev/nmdm0A -c 2 -m 1g test
> 
> Note the sectorsize=512/4096 options.  I am not sure whether those options are
> the cause of the trouble.
> 
> Then, in the VM:
> $ zpool create l2arc-test mirror /dev/vtbd1 /dev/vtbd2
> $ zfs create -p l2arc-test/ROOT/initial
> $ tar -c --one-file-system -f - / | tar -x -C /l2arc-test/ROOT/initial -f -
> 
> Afterwards, zpool status -v reports no problem.
> But after I run zpool scrub, I get the following:
> $ zpool status -v
>   pool: l2arc-test
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://illumos.org/msg/ZFS-8000-8A
>   scan: scrub repaired 356K in 0h0m with 9 errors on Tue Jun  2 13:58:17 2015
> config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         l2arc-test  ONLINE       0     0     9
>           mirror-0  ONLINE       0     0    18
>             vtbd1   ONLINE       0     0    25
>             vtbd2   ONLINE       0     0    23
> 
> errors: Permanent errors have been detected in the following files:
> 
>         /l2arc-test/ROOT/initial/usr/bin/svnlitesync
>         /l2arc-test/ROOT/initial/usr/freebsd-dist/kernel.txz
>         /l2arc-test/ROOT/initial/usr/freebsd-dist/src.txz
>         /l2arc-test/ROOT/initial/usr/lib/clang/3.6.1/lib/freebsd/libclang_rt.asan-x86_64.a
> 
> 
> The same issue is reproducible with ahci-hd.
> 
> My host system is a recent amd64 CURRENT as well.  The hardware platform is AMD.
> 
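
In the guest, what those sectorsize=512/4096 disks actually report, and which
ashift ZFS ended up using, can be double-checked with something like this (a
quick sketch; device and pool names as in the commands above):

$ diskinfo -v /dev/vtbd1              # reported sector size and stripe size
$ zdb -l /dev/vtbd1 | grep ashift     # ashift recorded in the vdev label

That should confirm whether the guest really sees a 512-byte logical /
4096-byte physical disk, and presumably ashift=12 for the pool in that case.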

I used the following monstrous command line to reproduce the test in qemu:
$ qemu-system-x86_64 -smp 2 -m 1024 \
    -drive file=/tmp/livecd2/R2.img,format=raw,if=none,id=bootd \
    -device virtio-blk-pci,drive=bootd \
    -drive file=/tmp/l2arc-test/hdd1,if=none,id=hdd1,format=raw \
    -device virtio-blk-pci,drive=hdd1,logical_block_size=4096 \
    -drive file=/tmp/l2arc-test/hdd2,id=hdd2,if=none,format=raw \
    -device virtio-blk-pci,drive=hdd2,logical_block_size=4096 \
    -drive file=/tmp/l2arc-test/ssd,id=ssd,if=none,format=raw \
    -device virtio-blk-pci,drive=ssd,logical_block_size=4096 ...

I also tried several other variations of logical_block_size and physical_block_size.
The tests are very slow, but there are no checksum errors.
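
A variation that mirrors bhyve's sectorsize=512/4096 (512-byte logical,
4096-byte physical sectors) would look something like this (an illustration,
not the exact command I ran):

    -device virtio-blk-pci,drive=hdd1,logical_block_size=512,physical_block_size=4096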

So, I suspect guest memory corruption caused by bhyve.  Perhaps the problem is
indeed specific to AMD-V.
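
One way to narrow it down further would be to take ZFS out of the picture and
compare raw writes and reads on one of the scratch disks in the guest (this
destroys whatever is on vtbd2; the sizes are just an example):

$ dd if=/dev/random of=/tmp/pattern bs=1m count=256
$ dd if=/tmp/pattern of=/dev/vtbd2 bs=1m
$ dd if=/dev/vtbd2 of=/tmp/readback bs=1m count=256
$ sha256 /tmp/pattern /tmp/readback

If the two hashes differ, the corruption happens in the block I/O path or in
guest memory, independently of anything ZFS-specific.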

-- 
Andriy Gapon

