fdisk(8) vs gpart(8), and gnop
Matthew Ahrens
mahrens at delphix.com
Sun Jun 1 21:27:44 UTC 2014
On Sun, Jun 1, 2014 at 9:07 AM, Nathan Whitehorn <nwhitehorn at freebsd.org>
wrote:
> On 06/01/14 09:00, Steven Hartland wrote:
>
>>
>> ----- Original Message ----- From: "Nathan Whitehorn" <
>> nwhitehorn at freebsd.org>
>> To: <freebsd-hackers at freebsd.org>; <freebsd-fs at freebsd.org>
>> Sent: Sunday, June 01, 2014 4:55 PM
>> Subject: Re: fdisk(8) vs gpart(8), and gnop
>>
>>
>> On 06/01/14 08:52, Steven Hartland wrote:
>>>
>>>> ----- Original Message ----- From: "Mark Felder" <feld at freebsd.org>
>>>>
>>>> On May 31, 2014, at 20:57, Freddie Cash <fjwcash at gmail.com> wrote:
>>>>>
>>>>>> There's a sysctl where you can set the minimum ashift for zfs. Then
>>>>>> you never need to use gnop.
>>>>>>
>>>>>> I believe it's part of 10.0?
>>>>>>
>>>>>
>>>>> I've not seen this yet. What we need is to port the ability to set
>>>>> ashift at pool creation time:
>>>>>
>>>>> $ zpool create -o ashift=12 tank mirror disk1 disk2 mirror disk3 disk4
>>>>>
>>>>> I believe the Linux zfs port has this functionality now, but we still
>>>>> do not.
>>>>>
>>>>
>>>> We don't have that direct option yet but you can achieve the
>>>> same thing by setting: vfs.zfs.min_auto_ashift=12
>>>>
>>> Does anyone have any objections to me changing this default, right now,
>>> today?
>>> -Nathan
>>>
>>
>> I think you will get some objections to that, as it can have quite an
>> impact on the performance of disks with 512-byte sectors, due to the
>> increased overhead of transferring 4k when only 512 is really required.
>> This has a more dramatic impact on RAIDZx too.
>>
>> Personally we run a custom kernel on our machines which has just this
>> change in it to ensure compatibility with future disks, so I can confirm
>> it does indeed have the desired effect :)
>>
>
> So the discussion here is related to what to do about the installer. The
> current ZFS component unconditionally creates gnops all over the place to
> set ashift to 4k. That's across the board worse: it has exactly the
> performance impact of changing the default of this sysctl (whatever that
> is), it can't easily be overridden (which the sysctl can), and it's a
> horrible hack to boot. There are a few options:
>
> 1. Change the default of vfs.zfs.min_auto_ashift
>
This is probably a bad idea -- as others have mentioned, it can drastically
impact space usage and performance on 512B disks, especially when using
small ZFS blocks (e.g. for databases or VDI) and/or RAID-Z. That said, it
could be a reasonable default for specialized distros that are not used for
these workloads (maybe FreeNAS or PCBSD?).
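
A rough sketch of the arithmetic behind the space concern, not a measurement. The function `raidz_asize` below is modeled on the sector rounding done by `vdev_raidz_asize()` in the ZFS source; the 6-disk RAID-Z2 layout and the block sizes are assumptions chosen only for illustration.

```python
def plain_asize(psize, ashift):
    """Bytes allocated on a single disk: psize rounded up to whole sectors."""
    sector = 1 << ashift
    return -(-psize // sector) * sector


def raidz_asize(psize, ashift, ndisks, nparity):
    """Bytes allocated on a RAID-Z vdev for a block of psize bytes.

    Modeled on vdev_raidz_asize(); treat it as an approximation.
    """
    data_width = ndisks - nparity
    # Data sectors, rounding the block up to whole sectors.
    sectors = ((psize - 1) >> ashift) + 1
    # Parity sectors: nparity for every full or partial row of data sectors.
    sectors += nparity * ((sectors + data_width - 1) // data_width)
    # Allocations are rounded up to a multiple of (nparity + 1) sectors.
    multiple = nparity + 1
    sectors = -(-sectors // multiple) * multiple
    return sectors << ashift


if __name__ == "__main__":
    # Hypothetical layout: 6-disk RAID-Z2.
    for psize in (512, 8192, 131072):
        a9 = raidz_asize(psize, 9, 6, 2)
        a12 = raidz_asize(psize, 12, 6, 2)
        print(f"{psize:>7} B block: ashift=9 -> {a9} B, ashift=12 -> {a12} B")
```

Under these assumptions an 8KB block allocates 12288 bytes at ashift=9 and 24576 bytes at ashift=12, while a 128KB block allocates the same amount either way, which is why the penalty falls mainly on small-block workloads.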
> 2. Have the same effect but in a vastly worse way by adjusting the
> installer to create gnops
> 3. Have ZFS choose by itself and decide to do that permanently.
>
If the device reports a 512B sector size, it would be great for ZFS to
assume the device could be lying, and automatically determine the minimum
ashift which gives good performance. I think this could be done reasonably
well for the common case by doing the following when each 512B-sector
device is added:

  1. do random 4KB writes to the disk to determine write IOPS (wIOPS) at 4KB
  2. do random 3.5KB writes to the disk to determine wIOPS at 3.5KB

If wIOPS at 4KB > wIOPS at 3.5KB, assume 4KB sectors; otherwise assume 512B
sectors. (Note: I haven't tried this in practice; we will need to test it
out and perhaps make some tweaks.)

I don't have the time or hardware to implement and test this, but I'd be
happy to mentor or code review.
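
The probe described above could be sketched in userland roughly as follows. This is an untested illustration, not ZFS code: the names `measure_write_iops` and `guess_ashift`, the one-second duration, the use of `O_SYNC`, and the choice to start every write on a 4KB boundary (so that only the write size differs between the two runs) are all assumptions. It overwrites data on the target, so it should only ever be pointed at a scratch device or file.

```python
import os
import random
import sys
import time

ALIGN = 4096  # assumed: every write starts on a 4KB boundary


def measure_write_iops(path, io_size, span, duration=1.0, seed=0):
    """Do random synchronous writes of io_size bytes within the first
    span bytes of path and return the number of writes per second.
    DESTRUCTIVE: overwrites whatever is stored there."""
    if duration <= 0 or span < io_size:
        raise ValueError("need duration > 0 and span >= io_size")
    rng = random.Random(seed)
    buf = os.urandom(io_size)
    slots = (span - io_size) // ALIGN + 1  # candidate start offsets
    fd = os.open(path, os.O_WRONLY | getattr(os, "O_SYNC", 0))
    try:
        count = 0
        start = time.monotonic()
        while time.monotonic() - start < duration:
            os.pwrite(fd, buf, rng.randrange(slots) * ALIGN)
            count += 1
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
    return count / elapsed


def guess_ashift(iops_4k, iops_3_5k):
    """Rule from the message: if 4KB writes are faster than 3.5KB writes,
    assume 4KB sectors (ashift=12); otherwise assume 512B (ashift=9)."""
    return 12 if iops_4k > iops_3_5k else 9


if __name__ == "__main__" and len(sys.argv) == 3:
    target, span = sys.argv[1], int(sys.argv[2])
    iops_4k = measure_write_iops(target, 4096, span)
    iops_3_5k = measure_write_iops(target, 3584, span)
    print(f"4KB: {iops_4k:.0f} wIOPS, 3.5KB: {iops_3_5k:.0f} wIOPS, "
          f"suggested ashift={guess_ashift(iops_4k, iops_3_5k)}")
```

A strict comparison is sensitive to measurement noise, so a real implementation would likely want repeated runs or a margin, in line with the "tweaks" the message anticipates.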
--matt
>
> Our ATA code is good about reporting block sizes now, so (3) isn't a big
> issue except for the mixed-pool case, which is a huge PITA.
>
> We need to choose one of these. I favor (1).
> -Nathan