followup storage question

Fri Sep 11 14:24:26 UTC 2015

On 09/11/15 09:23, Chad J. Milios wrote:
>> On Sep 11, 2015, at 8:59 AM, William A. Mahaffey III <wam at hiwaay.net> wrote:
>>
>>
>>
>> The Wiki page https://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/9.0-RELEASE illustrates using gnop to enforce 4K alignment of gpt partitions for subsequent use by ZFS. However, the gpart commands also use the '-a 4k' argument, which as I understand it aligns partitions on 4k boundaries. Is the gnop command also necessary? TIA & have a nice weekend.
>>
>>
>> -- 
>>
>>     William A. Mahaffey III
> Yes, both are necessary: they handle two separate facets of the same underlying issue. Those facets are the partition's alignment on the outer device and the block size that the partition's device node reports to ZFS.
>
> The latter can effectively be done a different way: later versions of FreeBSD have a sysctl, vfs.zfs.min_auto_ashift, which you can set to 12 for 4096-byte blocks or 9 for the default 512 bytes. (The ashift value is the power to which 2 is raised to get the number of bytes in a block.)
>
> The old gnop way still works just fine, so I still use that method, personally. It definitely only has to be done when vdev(s) are added/created/replaced* on the pool, not on every mount/import. By then ZFS goes by the formatting metadata it stamped on the vdev rather than what the device node's ioctls report, and so will always write the larger, correctly aligned blocks.
>
> (I'm not sure the same holds in the reverse direction, which is not a typical use, and I know min_auto_ashift won't help there. That is the case where you use gnop to present smaller blocks to ZFS than the device node reports, say because you are willing to accept a certain amount of write amplification in exchange for storing lots of small files/directories/metadata more efficiently. In that case you may need to enable gnop every time. I'm not sure, because I don't run any pools set up that way, but I know you can do it if you want to for that reason, space overhead. It would take some testing and actual measurement before I could confidently say gnop can be skipped after vdev initialization when going in that opposite direction. Maybe someone will chime in to let us know for sure. At any rate, gnop is by its nature just about the fastest and lightest geom class under the sun, and I believe you can keep thousands of instances running busily in production and see no noticeable overhead.)
>
> *Yes, mind the gnop or the ashift sysctl whenever replacing as well. Ashift is a vdev property that is not copied as part of the data resilvering; ZFS decides it for each vdev independently, even though a pool with mixed ashift values seems totally unintuitive. I've seen it forgotten at replace time. When you do use gnop, it's something of a pain to get gnop/ZFS to relinquish the vdev if you do an online replace and then want to clear off the gnop. I'd just leave it in place: upon reboot the gnop will disappear, and ZFS will pick up the real vdev and properly do what you want with it. There should be no problem with running for years in the meantime and then coming up slightly differently on the next boot, bypassing gnop and with the correct ashift throughout.
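
For anyone following along, the arithmetic behind the two facets above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and not part of any FreeBSD or ZFS tool, and the start sectors 34 and 40 are example values chosen for the demonstration, assuming 512-byte logical sectors.

```python
# Illustrative arithmetic only. The helper names are hypothetical and do not
# correspond to any FreeBSD or ZFS interface.

def ashift_to_block_size(ashift: int) -> int:
    """Block size in bytes is 2 raised to the ashift value."""
    return 1 << ashift


def block_size_to_ashift(block_size: int) -> int:
    """Inverse mapping; only defined for positive powers of two."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a positive power of two")
    return block_size.bit_length() - 1


def is_aligned(start_sector: int, sector_size: int = 512,
               boundary: int = 4096) -> bool:
    """True if the partition's starting byte offset is a multiple of boundary."""
    return (start_sector * sector_size) % boundary == 0


if __name__ == "__main__":
    # Facet 1: the block size ZFS uses, expressed as ashift.
    print(ashift_to_block_size(12))    # 4096
    print(ashift_to_block_size(9))     # 512
    print(block_size_to_ashift(4096))  # 12

    # Facet 2: where the partition starts on the outer device.
    # Example start sectors (assumed for illustration, 512-byte sectors).
    print(is_aligned(40))  # True:  40 * 512 = 20480 = 5 * 4096
    print(is_aligned(34))  # False: 34 * 512 = 17408, remainder 1024
```

The two checks are independent, which is the point of the answer above: a partition can start on a 4096-byte boundary while the pool still uses ashift 9, or the reverse, so each has to be handled on its own.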

Excellent, clear as a bell :-). Thanks.

-- 

	William A. Mahaffey III

  ----------------------------------------------------------------------

	"The M1 Garand is without doubt the finest implement of war
	 ever devised by man."
                            -- Gen. George S. Patton Jr.