FreeBSD ZFS file server with SSD HDD
Frank Leonhardt (m)
frank2 at fjl.co.uk
Fri Oct 13 17:22:44 UTC 2017
On 13 October 2017 14:43:16 BST, Kate Dawson <k4t at 3msg.es> wrote:
>On Fri, Oct 13, 2017 at 01:04:50PM +0100, Frank Leonhardt (m) wrote:
>> This all matters A LOT if you're using ZFS to back a virtual HD for a
>> VM. Things like vSphere make every NFS write synchronous. Given the
>> guest OS is probably using a file format that makes this pointless adds
>> insult to injury. ZFS writing every block to a CoW file will fragment
>> it all to hell and back.
>> So, throwing hardware at it isn't going to solve the underlying
>> problem. You need to sort out the sync writes. If your file store is on
>> a UPS, ignore them (I comment out the code). And store your virtual HD
>> on UFS if possible.
>sync is disabled on the dataset, I think it's all async. The VMs all
>have journalled file systems, and the system is UPS backed.
>NFS is mounted async - I think that is the default for Debian Linux.
>My understanding is that ZFS will always be consistent, however in the
>case of catastrophic shutdown some data may not be written to disk, and
>we will just have to take our chances.
>A key reason for ZFS being chosen was snapshots. Otherwise I think
>GNU/Linux would have been chosen over FreeBSD and ZFS
>Thanks for the detailed info on ZIL/SLOG
My reply was supposed to go to the questions list, but I had trouble with Android mail!
Yes, your understanding of ZFS is correct. And also misleading. The file structure will be consistent and have no write holes. But this isn't true at the application level. Suppose you have two files. The application wants to write a corresponding entry in both. If the power fails after the first write, one file will be fully updated, and the second will be fully NOT updated. It's a matter of good application design as to what happens when the power is restored.
At least ZFS guarantees its structure is correct.
No amount of OS cleverness on fancy hardware is going to save you from a sloppy application.
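To illustrate the two-file case above, here is a minimal Python sketch of one way an application might protect itself. It is not taken from any real application: the function names and the JSON journal layout are made up for illustration, and it assumes a POSIX system where a directory can be opened and fsync'd. The idea is to record the intended update in one durable journal file first, then apply it, so that an interrupted update can be finished at start-up.

```python
import json
import os


def fsync_dir(path):
    # Flush the directory itself so a create, rename or unlink survives a power cut.
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def write_durably(path, text):
    # Write to a temporary name, force it to disk, then rename over the target.
    # The rename is atomic, so a reader sees either the old or the new file.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(text)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)
    fsync_dir(os.path.dirname(os.path.abspath(path)))


def apply_journal(journal):
    # Replay the recorded intent. Safe to repeat, because each write
    # replaces the whole file with the same content.
    with open(journal) as f:
        intent = json.load(f)
    for path, text in intent.items():
        write_durably(path, text)
    os.remove(journal)
    fsync_dir(os.path.dirname(os.path.abspath(journal)))


def update_both(journal, path_a, text_a, path_b, text_b):
    # Step 1: record what we intend to do, in a single durable file.
    write_durably(journal, json.dumps({path_a: text_a, path_b: text_b}))
    # Step 2: carry it out.
    apply_journal(journal)


def recover(journal):
    # Run at start-up: finish any update that was interrupted.
    if os.path.exists(journal):
        apply_journal(journal)
```

If the power fails before the journal is renamed into place, neither file changes. If it fails afterwards, calling recover() on restart completes both writes. Either way the two files end up agreeing with each other, which is the part ZFS cannot do for you.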
You said you were running VMs using Xen. I use Xen out of preference, but I must admit I've never checked to see if all its writes are sync. When I want performance, I don't use a VM in the first place. But let's assume it does. If you have Windows running, it won't like any interruption to its disk activity. The backing file may be consistent, but its contents won't be. But that's a hit from using Windows however you do it.
You mentioned you had disabled sync on the dataset. I bet you hardly noticed a difference, right? The sync property has three modes: sync nothing (disabled), sync everything (always) and sync when asked (standard). It's not obvious, but the default is to sync only when asked, and I'd be inclined to keep it that way.
Your problem is more likely NFS. Your client asks for a sync for every write because it's lazy. This request goes all the way over the network to ZFS, which probably deals with it fairly quickly. However, the client has to wait for the confirmation to get all the way back to its OS before it returns from the system call, only to get another block to write and start the journey again. That's POSIX for you. If application programmers only did a sync write when necessary, it'd be fine. But they don't.
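The difference between syncing every write and syncing only when necessary can be sketched in Python as below. This is only an illustration: the function names are made up, and the timings will depend entirely on the storage and network underneath.

```python
import os
import time


def write_sync_each(path, blocks):
    # Ask for a flush after every block: each write waits for stable storage
    # (and, over NFS, for the round trip to the server) before the next starts.
    with open(path, "wb") as f:
        for block in blocks:
            f.write(block)
            f.flush()
            os.fsync(f.fileno())


def write_sync_once(path, blocks):
    # Write everything, then ask for a single flush at the point where
    # the application actually needs the data to be safe.
    with open(path, "wb") as f:
        for block in blocks:
            f.write(block)
        f.flush()
        os.fsync(f.fileno())


def timed(func, path, blocks):
    # Return how long one of the writers above takes, in seconds.
    start = time.perf_counter()
    func(path, blocks)
    return time.perf_counter() - start
```

Calling timed() with each writer, a path on the NFS mount and a few hundred 4 KiB blocks should show how much of the time goes on waiting for confirmations rather than on moving data.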
There is a tuneable to tell NFS to lie, if you don't care about POSIX compliance. Unfortunately I'm on a train and can't remember the detail, but it's somewhere on my blog (I think). You'll find it improves matters around 300%.
Sent from my Android device with K-9 Mail. Please excuse my brevity.