ZVOLs volmode/sync performance influence – affecting windows guests via FC RDM.vmdk
Harry Schmalzbauer
freebsd at omnilan.de
Fri Oct 4 17:17:04 UTC 2019
Hello,
I noticed a significant guest write performance drop with volmode=dev
during my 12.1 fibre channel tests.
I remember hearing occasional reports of this from others over the last few
years, so I decided to see how far I could track it down.
Unfortunately, I found no way to demonstrate the effect with in-box
tools, nor even with fio(1) (from ports/benchmarks/fio).
Since I don't know how zvols/ctl(4) work under the hood, I'd need help from
the experts to understand how/why volmode seems to affect the sync
property/behaviour.
The numbers I see suggest that setting volmode=geom causes the same
ZFS _zvol_ behaviour as setting the sync property to "disabled".
Why? Shortest summary: performance of Windows guests writing files onto
an NTFS filesystem drops by a factor of ~8 with volmode=dev, but
· after setting sync=disabled on volmode=dev ZVOLs, I see the same
write rate as I get with volmode=geom (see the zfs one-liners below);
· disabling the disk's write cache flushing in Windows has exactly the
same effect, while leaving sync=standard.
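For reference, the sync toggling I'm talking about is just the ZFS dataset
property, set on the zvol itself (MyPool/testvol is my test dataset, adjust
as needed):

# check the current value ("standard" is the default)
zfs get sync MyPool/testvol
# disable sync writes for the zvol only – this restores the
# volmode=geom numbers for me
zfs set sync=disabled MyPool/testvol
# back to the default behaviour
zfs set sync=standard MyPool/testvol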
Here's a little more background information.
The Windows guest uses the zvol-backed FC target as a mapped raw device
from a virtual SCSI controller
(ZVOL->ctl(4)->isp(4)->qlnativefc(ESXi-Initiator)->RDM.vmdk->paravirt-SCSI->\\.\PhysicalDrive1->GPT...NTFS)
The initiator is ESXi 6.7, but I'm quite sure I saw the same effect with
iSCSI (Windows software iSCSI initiator) instead of FC some time ago,
though I haven't verified that for this run.
Here's what I've done trying to reproduce the issue, leaving
Windows/ESXi out of the picture:
I'm creating a ZVOL block backend for ctl(4):
zfs create -V 10G -o compression=off -o volmode=geom -o sync=standard MyPool/testvol
ctladm create -b block -d guest-zvol -o file=/dev/zvol/MyPool/testvol
The first line creates the ZVOL with default values. If neither the pool
nor the parent dataset has local values set for the compression, volmode
or sync properties, the three "-o"s can be omitted.
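To double-check which values the zvol actually ended up with (local vs.
inherited), something like this should do:

zfs get -o property,value,source compression,volmode,sync MyPool/testvol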
ctladm port -p `ctladm port -l | grep "camsim.*naa" | cut -w -f 1` -o on
Now I have a "FREEBSD CTLDISK 0001", available as geom "daN".
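If you're unsure which daN the CTL LUN ended up as, this is roughly how
I'd look it up:

camcontrol devlist | grep CTLDISK
ctladm devlist -v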
To make the simulation even more realistic, I'm using the second isp(4)
port as initiator (to be precise, I use two ports in simultaneous
target/initiator role, so I have the ZVOL-backed block device available
both with and without a real FC link in the path).
Utilizing dd(1) on the 'da' connected to the FC initiator, I get
_exactly_ the same numbers as in my Windows guest across all the
different block sizes!!!
E.g., for the 1k test, I'm running
dd if=/dev/zero bs=1k of=/dev/da11 count=100k status=progress (~8MB/s)
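The other block sizes are just the same dd line in a loop, roughly like
this (da11 and the sizes are from my setup; pick the count so the total
stays well below the 10G zvol size):

for bs in 512 1k 4k 8k 16k 32k 64k; do
        echo "== bs=${bs}"
        dd if=/dev/zero bs=${bs} of=/dev/da11 count=10k status=progress
done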
For those wanting to follow the experiment, remove the "volmode=geom" zvol first:
ctladm port -p `ctladm port -l | grep "camsim.*naa" | cut -w -f 1` -o off
ctladm remove -b block -l 0    (<- only if you don't have LUN 0 in use otherwise)
zfs destroy MyPool/testvol
"volmode" property can be altered at runtime, but won't have any
effect! Either you would have to reboot or re-import the pool.
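I.e., to make a volmode change effective without a reboot, the re-import
would look roughly like this (sketch only; remove the CTL LUN first so
nothing keeps the zvol open):

ctladm remove -b block -l 0
zpool export MyPool
zpool import MyPool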
For my test I can simply create a new, identical ZVOL, this time with
volmode=dev (instead of geom as before).
zfs create -V 10G -o compression=off -o volmode=dev -o sync=standard MyPool/testvol
ctladm create -b block -d guest-zvol -o file=/dev/zvol/MyPool/testvol
ctladm port -p `ctladm port -l | grep "camsim.*naa" | cut -w -f 1` -o on
Now the same Windows filesystem write test drops in throughput by a
factor of ~8 for 0.5k-32k block sizes, and still by about a factor of 3
for larger block sizes.
(at this point you'll most likely have noticed a panic with 12.1-BETA3;
see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240917 )
Unfortunately, I can't see any performance drop with the dd line from above.
Since fio(1) has a parameter to issue fsync(2) after every N written
blocks, I also tried to reproduce with:
echo "[noop]" | fio --ioengine=sync --filename=/dev/da11 --bs=1k --rw=write --io_size=80m --fsync=1 -
To my surprise, I still do _not_ see any performance drop, while I
reproducibly see the big factor-8 penalty on the Windows guest.
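One variation I haven't tried yet would be to open the device
O_DIRECT/O_SYNC instead of relying on fsync. Just a sketch – the flags do
exist in fio, but I don't know whether they get any closer to what
Windows actually does on a raw da device here:

echo "[noop]" | fio --ioengine=sync --filename=/dev/da11 --bs=1k \
        --rw=write --io_size=80m --direct=1 --sync=1 -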
Can anybody tell me which part I'm missing to simulate the real-world
issue?
As mentioned, either disabling the disk's write cache flushing in
Windows, or alternatively setting sync=disabled, restores the Windows
write throughput to the same numbers as with volmode=geom.
fio(1) also has an ioengine "sg", which isn't usable here and which I
know nothing about. Maybe somebody has a hint in that direction?
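The only other idea I have so far is to skip the fsync layer completely
and issue an explicit SCSI SYNCHRONIZE CACHE after the writes, since
that is presumably what the guest's cache flushing translates to on the
wire. Untested sketch via a camcontrol(8) raw CDB (opcode 0x35 =
SYNCHRONIZE CACHE(10), all-zero fields = flush the whole device):

# untested – flush the CTL-backed da device from the initiator side
camcontrol cmd da11 -v -c "35 00 00 00 00 00 00 00 00 00"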
Thanks
-harry