bhyve and vfs.zfs.arc_max, and zfs tuning for a hypervisor

Matt Churchyard matt.churchyard at userve.net
Thu Mar 21 10:24:23 UTC 2019


> > 
> > > 1. Does ARC actually cache zfs volumes (not files/datasets)?
> > 
> > Yes it does.
> 
> I find this distinction between volumes/files/etc and what is cached 
> causes confusion (as well as "volumes not datasets").
> 
> Both ZVOLs and ZFS file systems are types of dataset. A dataset stores 
> data in records (usually up to 128 KB in size).  It's these records 
> that are cached (and that most ZFS functions such as 
> compression/raidz/zil/etc work with).
> 
> As far as the ZFS lower levels are concerned, there is no difference 
> between a volume and a file system.

>Thank you Matt, this was very instructive.

> > > 2. If ARC does cache volumes, does this cache make sense on a 
> > > hypervisor, because guest OSes will probably have their own disk cache anyway.
> > 
> > IMHO not much, because the guest OS is relying on the fact that when 
> > it writes its own cached data out to „disk“, it will be committed 
> > to stable storage.
> 
> Maybe I've missed something but I don't quite get the link between 
> read cache (ARC) and guest writes here?

>Maybe there was a confusion between read and write caches, but my question still stands:

>Does it make sense to cache the same data (for reading too) twice: once in the host's RAM (ZFS ARC) and again in the guest's RAM (whatever fs the guest uses; all modern OSes have disk caches)?

>What do VMware or VirtualBox do in this situation? Do they ever cache their volumes in the hypervisor's RAM?

VirtualBox would be no different from bhyve in that it doesn't care what storage system you are using or how it is configured; that's up to the system admin.
I believe VMFS is more akin to other "traditional" file systems and doesn't do RAM caching to anywhere near the extent that ZFS does. I do think you can use SSD/NVMe/etc. as a cache in VMware.

My initial instinct would be to keep the cache on, but lower the ARC limit so that the majority of the RAM is left for guests. (I'd still want at least 4GB of ARC as an absolute minimum though, and probably more on systems with 100GB+ of total RAM.) Of course, you could test with the cache (the primarycache property) set to all and then to metadata, and see what effect it has. Adding L2ARC may be useful if the main pool is on spinning disks, although I've heard there's a rule of thumb that requires X amount of ARC for Y amount of L2ARC; I'm not sure what that rule is.

I'd also be intrigued to know what the logic in FreeNAS is for this. Is it simply a case of "(arc = total_ram - guest_allocated)"?
Is there a lower limit based on a percentage of total RAM, and/or a hard lower limit?
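
To make the sizing question concrete, here is a rough Python sketch of the "total RAM minus guest allocation, but never below a floor" idea. This is not FreeNAS's actual logic, which I don't know. The function names, the optional percentage floor and the example figures are all made up for illustration; only the 4GB minimum comes from my suggestion above.

```python
GIB = 1024 ** 3


def suggest_arc_max(total_ram, guest_allocated,
                    hard_floor=4 * GIB, floor_fraction=0.0):
    """Return a candidate ARC limit in bytes.

    Hypothetical policy: give the ARC whatever the guests leave over,
    but never less than a hard floor or a fraction of total RAM.
    """
    if total_ram <= 0 or guest_allocated < 0:
        raise ValueError("RAM sizes must be positive")
    leftover = total_ram - guest_allocated
    # The floor is the larger of the hard minimum and the percentage minimum.
    floor = max(hard_floor, int(total_ram * floor_fraction))
    # Never suggest more than the machine physically has.
    return min(total_ram, max(leftover, floor))


def loader_conf_line(arc_max_bytes):
    """Format the value as a line for /boot/loader.conf."""
    return 'vfs.zfs.arc_max="{}"'.format(arc_max_bytes)


if __name__ == "__main__":
    # Example host: 64 GiB of RAM with 48 GiB handed out to guests.
    print(loader_conf_line(suggest_arc_max(64 * GIB, 48 * GIB)))
```

Note that when the guests are over-allocated the floor wins, so the ARC limit plus guest memory can exceed physical RAM; whether that is acceptable is exactly the policy question I'm asking about.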

Matt


