getting to 4K disk blocks in ZFS

Steven Hartland killing at multiplay.co.uk
Thu Sep 11 01:22:09 UTC 2014


----- Original Message ----- 
From: "Aristedes Maniatis" <ari at ish.com.au>
To: "Stefan Esser" <se at freebsd.org>; "freebsd-stable" <freebsd-stable at freebsd.org>
Sent: Thursday, September 11, 2014 1:45 AM
Subject: Re: getting to 4K disk blocks in ZFS


> Thanks Stefan and Peter for the highly informative posts.
>
> On 10/09/2014 5:48pm, Stefan Esser wrote:
>> ZFS uses variable block sizes by breaking down large blocks to smaller
>> fragments as suitable for the data to be stored. The largest block to
>> be used is configurable (128 KByte by default) and the smallest fragment
>> is the sector size (i.e. 512 or 4096 bytes), as configured by "ashift".
>
> So this means that the ZFS developers would need to effectively
> (re)fragment the entire pool if they wanted to develop a way to
> increase the ashift size. This sounds like something that isn't going
> to be solved in the near future (less than three years) if it is a
> similar technical problem to inserting another disk into an existing
> vdev.
>
> And that means that as it becomes harder to buy older 512 byte disks,
> everyone with a ZFS pool is going to be stuck with managing quite a
> lot of downtime as they upgrade. And even more pain if they boot off
> that pool.
>
>
> On 10/09/2014 4:51pm, Peter Wemm wrote:
>> For what it's worth, in the freebsd.org cluster we automatically align
>> everything to a minimum of 4k, no matter what the actual drive is.
>>
>> We set:  sysctl vfs.zfs.min_auto_ashift=12
>> (this saves a lot of messing around with gnop etc)
>>
>> and ensure all the gpt slices are 4k or better aligned.
>
> Should the FreeBSD project change this minimum in the next release?
> There seems to be no downside and a huge amount of pain for people
> who stumble along with the defaults not knowing what a mess they are
> creating to solve later.

The downside is wasted space, which can be significant; that is why,
when I last suggested exactly this, it was unfortunately rejected.
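
As a rough illustration of where that space goes, here is a minimal
Python sketch. It assumes a simplified model in which every block is
rounded up to a whole number of 2**ashift-byte sectors, and it ignores
compression, RAID-Z parity and padding, and metadata overhead. The
function names are made up for this example and are not part of ZFS.

```python
def allocated_bytes(logical_size: int, ashift: int) -> int:
    """Round a block up to a whole number of 2**ashift-byte sectors.

    Simplified model (an assumption for illustration): the smallest
    allocation unit is one sector of 2**ashift bytes.
    """
    sector = 1 << ashift
    # Ceiling division, then scale back up to bytes.
    return -(-logical_size // sector) * sector


def wasted_bytes(logical_size: int, ashift: int) -> int:
    """Bytes allocated beyond what the data itself needs."""
    return allocated_bytes(logical_size, ashift) - logical_size


if __name__ == "__main__":
    # Compare ashift=9 (512-byte sectors) with ashift=12 (4096-byte sectors).
    for size in (512, 1536, 4096, 5120, 131072):
        print(
            f"{size:>7} bytes: "
            f"ashift=9 wastes {wasted_bytes(size, 9):>5}, "
            f"ashift=12 wastes {wasted_bytes(size, 12):>5}"
        )
```

Under this model the overhead falls almost entirely on small blocks:
a 512-byte block occupies a full 4096-byte sector with ashift=12,
while a 128 KByte block wastes nothing with either setting.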

We still maintain a local patch to our source tree which does just
this because, as you've mentioned, we don't want the pain, so it's
easier to just run everything as 4k.

    Regards
    Steve 


