ZVOLs volmode/sync performance influence – affecting windows guests via FC RDM.vmdk

Harry Schmalzbauer freebsd at omnilan.de
Fri Oct 4 17:17:04 UTC 2019


Hello,

I noticed a significant guest write performance drop with volmode=dev
during my 12.1 fibre channel tests.
I remember having heard such reports occasionally over the last years,
so I decided to see how far I can track it down.
Unfortunately, I found no way to reproduce the effect with in-box
tools, not even with fio(1) (from ports/benchmarks/fio).

Since I don't know how zvols/ctl(4) work under the hood, I'd need help
from the experts to understand how/why volmode seems to affect the sync
property/behaviour.

The numbers I see make me think that setting volmode=geom causes the
same ZFS _zvol_ behaviour as setting the sync property to "disabled".
Why? Shortest summary: performance of a Windows guest writing files
onto an NTFS filesystem drops by a factor of ~8 with volmode=dev, but
· after setting sync=disabled on the volmode=dev ZVOL, I see the same
write rate as I get with volmode=geom, and
· disabling the disk's write cache flushing in Windows has exactly the
same effect, while leaving sync=standard.

Here's a little more background information.

The Windows guest uses the zvol-backed FC target as a mapped raw device
behind a virtual SCSI controller:
ZVOL->ctl(4)->isp(4)->qlnativefc(ESXi initiator)->RDM.vmdk->paravirt-SCSI->\\.\PhysicalDrive1->GPT...NTFS
The initiator is ESXi 6.7, and I'm quite sure I saw the same effect
with iSCSI (Windows software iSCSI initiator) instead of FC some time
ago, though I haven't re-verified that this time.
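For reference, the FreeBSD-side half of that mapping can be inspected
with ctladm(8); nothing special, just to confirm the port and LUN
assignments:
     ctladm portlist
     ctladm lunlist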


Here's what I've done trying to reproduce the issue, leaving
Windows/ESXi out of the picture:

I'm creating a ZVOL block backend for ctl(4):
     zfs create -V 10G -o compression=off -o volmode=geom -o sync=standard MyPool/testvol
     ctladm create -b block -d guest-zvol -o file=/dev/zvol/MyPool/testvol

The first line creates the ZVOL with default values.  If the pool or
parent dataset doesn't set local values for the compression, volmode or
sync properties, the three "-o" options can be omitted.
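To double-check what was actually created before enabling the port:
     zfs get -o property,value,source volmode,sync,compression,volblocksize MyPool/testvol
     ctladm devlist -v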

     ctladm port -p `ctladm port -l | grep "camsim.*naa" | cut -w -f 1` -o on

Now I have a "FREEBSD CTLDISK 0001", available as geom "daN".
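camcontrol(8) shows which daN that is:
     camcontrol devlist | grep CTLDISK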

To get even closer to the real setup, I'm using the second isp(4) port
as initiator (to be precise, I use two ports in simultaneous
target/initiator role, so I have the ZVOL-backed block device available
both with and without a real FC link in the path).
Using dd(1) on the 'da' device connected to the FC initiator, I get
_exactly_ the same numbers as in my Windows guest across all the
different block sizes!
E.g., for the 1k test, I'm running
     dd if=/dev/zero bs=1k of=/dev/da11 count=100k status=progress (~8MB/s)
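The other block sizes are just the same line with a different bs;
roughly what I loop over (sizes up to 32k keep the total well below the
10G volume):
     for bs in 512 1k 4k 8k 16k 32k; do
         dd if=/dev/zero of=/dev/da11 bs=$bs count=100k status=progress
     done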

For those wanting to follow the experiment – remove the "volmode=geom"-zvol:
     ctladm port -p `ctladm port -l | grep "camsim.*naa" | cut -w -f 1` -o off
     ctladm remove -b block -l 0   (<– only if you don't have LUN 0 in use otherwise)
     zfs destroy MyPool/testvol

"volmode" property can be altered at runtime, but won't have any 
effect!  Either you would have to reboot or re-import the pool.
For my test I can simply create a new, identical ZVOL, this time with 
volmode=dev (instead of geom like before).
     zfs create -V 10G -o compression=off -o volmode=dev -o sync=standard MyPool/testvol
     ctladm create -b block -d guest-zvol -o file=/dev/zvol/MyPool/testvol
     ctladm port -p `ctladm port -l | grep "camsim.*naa" | cut -w -f 1` -o on
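A quick sanity check that there really is no GEOM provider this time,
while the /dev node ctl(4) needs is still there (geom -t prints the
GEOM topology tree):
     zfs get -o property,value volmode MyPool/testvol
     geom -t | grep testvol        # no match with volmode=dev; the provider shows up with volmode=geom
     ls -l /dev/zvol/MyPool/testvol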

Now the same Windows filesystem write test drops in throughput by a
factor of ~8 for 0.5-32k block sizes, and still by about a factor of 3
for larger block sizes.

(at this point you'll most likely have noticed a panic with 12.1-BETA3; 
see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240917 )

Unfortunately, I can't see any performance drop with the dd line from above.
Since fio(1) has a parameter to issue fsync(2) after every N written
blocks, I also tried to reproduce with:
     echo "[noop]" | fio --ioengine=sync --filename=/dev/da11 --bs=1k --rw=write --io_size=80m --fsync=1 -
To my surprise, I still do _not_ see any performance drop, while I
reproducibly see the big factor-8 penalty on the Windows guest.
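The only other fio(1) knobs I can think of are O_SYNC/O_DIRECT opens
(fio's --sync=1 and --direct=1); I don't know whether that comes any
closer to what the Windows/ctl path actually does:
     echo "[noop]" | fio --ioengine=sync --filename=/dev/da11 --bs=1k --rw=write --io_size=80m --fsync=1 --sync=1 --direct=1 -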

Can anybody tell me which part I'm missing to simulate the real-world
issue?
As mentioned, either disabling the disk's write cache flushing in
Windows, or alternatively setting sync=disabled, restores the Windows
write throughput to the same numbers as with volmode=geom.

fio(1) also lists an "sg" ioengine, which isn't usable here and which I
know nothing about.  Maybe somebody has a hint in that direction?

Thanks

-harry



