Invalid ashift increase allowed by r253441

Steven Hartland killing at multiplay.co.uk
Thu Aug 1 09:56:50 UTC 2013


----- Original Message ----- 
From: "Xin Li" <delphij at delphij.net>
To: "Steven Hartland" <killing at multiplay.co.uk>
Cc: <d at delphij.net>; "Xin LI" <delphij at FreeBSD.org>; <zfs-devel at FreeBSD.org>
Sent: Thursday, August 01, 2013 9:46 AM
Subject: Re: Invalid ashift increase allowed by r253441


> On 8/1/13 1:24 AM, Steven Hartland wrote:
>> When I first thought about this I was in agreement that it really 
>> should deal with this case but then I thought about the overhead 
>> this would add for each and every IO request, and would that
>> really be worth it?
>> 
>> Given the performance impact that is very evident when you use 
>> SSDs that lie about their sector size I'd have to say I don't 
>> think so.
> 
> I'm not sure if I have followed -- could you be more specific on what
> kind of overhead?  (Speaking for the escalated read or read before
> write, I think we just can't avoid it without recreating the pool,
> assuming an ashift=9 image is dd'ed into an ashift=12 storage.)
> 
> Or are you talking about something that I have overlooked?

I see two use cases for this:-
1. Disk quirks to make a 4K native disk report as such.
2. Manually copying a pool disk from one device to another with
   a larger sector size.

#1 is going to be the most common case and this should be dealt
with by correctly reporting the two values to ZFS, no other changes
are needed for this to work. The performance won't be optimal but
that's to be expected.
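
As a rough illustration of the "two values" in #1, the logical and
physical sector sizes each map to an ashift (log2 of the size). This is
only a sketch: `sector_ashift` and the example drive are assumptions made
for illustration, not the actual ZFS or GEOM code.

```python
def sector_ashift(sector_size: int) -> int:
    """Return log2 of a power-of-two sector size (512 -> 9, 4096 -> 12)."""
    if sector_size <= 0 or sector_size & (sector_size - 1):
        raise ValueError(f"sector size {sector_size} is not a power of two")
    return sector_size.bit_length() - 1


# Hypothetical drive: 512-byte logical sectors on 4K physical sectors.
logical_ashift = sector_ashift(512)    # smallest IO the device accepts
physical_ashift = sector_ashift(4096)  # IO size the device handles best
```

With both values available, a pool with ashift=9 still works on such a
drive because the logical size is satisfied; only performance suffers.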

#2 is where you're going to need ZFS to do validation / conversion
on every IO to check if it needs alignment changes, escalated reads
or a read before write for small writes.
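
The per-IO check described for #2 can be sketched as below. The helper
names `is_aligned` and `escalate` are hypothetical and the code is a
minimal illustration of the arithmetic involved, assuming byte offsets
and power-of-two sectors; it is not how ZFS implements anything.

```python
def is_aligned(offset: int, size: int, ashift: int) -> bool:
    """True if both offset and size are multiples of the sector size."""
    mask = (1 << ashift) - 1
    return (offset & mask) == 0 and (size & mask) == 0


def escalate(offset: int, size: int, ashift: int) -> tuple:
    """Widen (offset, size) to the enclosing sector-aligned range."""
    sector = 1 << ashift
    start = offset & ~(sector - 1)                      # round down
    end = (offset + size + sector - 1) & ~(sector - 1)  # round up
    return start, end - start


# A 512-byte IO at offset 1536, issued by an ashift=9 pool to a device
# that only accepts 4K IO (ashift=12):
needs_fixup = not is_aligned(1536, 512, 12)
fixed = escalate(1536, 512, 12)
```

A read would be escalated to the widened range and trimmed afterwards; a
small write would first read that range, patch in the new bytes and then
write the whole range back.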

This is going to be pretty rare and I don't believe the rareness
justifies slowing down the common case (as you'd always
need to check every IO if this was supported, even if said
device didn't require it).

That said it may be possible to add a transformation stage into
the zio pipeline on pool load for devices that require it, so
the impact to devices which don't require transformation would
be zero.
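
The idea of deciding once at pool load, rather than on every IO, might
look something like the following. This is a sketch only: the stage list
is a stand-in for the zio pipeline, and `build_pipeline`, `align_stage`
and `run` are hypothetical names, not existing ZFS functions.

```python
def align_stage(dev_ashift: int):
    """Return a stage that widens an IO to the device's sector size."""
    sector = 1 << dev_ashift

    def stage(offset: int, size: int) -> tuple:
        start = offset // sector * sector
        end = (offset + size + sector - 1) // sector * sector
        return start, end - start

    return stage


def build_pipeline(pool_ashift: int, dev_ashift: int) -> list:
    """Called once per device at load time, not once per IO."""
    stages = []
    if dev_ashift > pool_ashift:
        stages.append(align_stage(dev_ashift))
    return stages


def run(stages: list, offset: int, size: int) -> tuple:
    """Pass an IO through whatever stages the device was given."""
    for stage in stages:
        offset, size = stage(offset, size)
    return offset, size
```

A device whose ashift matches the pool gets an empty list, so its IOs
pass through without any extra check.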

I don't think this will be a simple task though so I believe the
current way forward should be:-
1. Revert the ashift check so it fails devices with larger
   ashift.
2. Add proper support for physical and logical sector size
   reporting for ashift.
3. Look at adding translation support for devices whose logical
   sector size is larger than the current ashift allows.
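
Step 1 of the list above amounts to a check along these lines when a
device is opened. This is a sketch under assumptions: `check_device` and
`AshiftMismatch` are hypothetical names, and treating the logical sector
size as the hard limit is an interpretation of the points above rather
than a description of the actual code.

```python
class AshiftMismatch(Exception):
    """Raised when a device cannot service IO at the pool's ashift."""


def check_device(pool_ashift: int, logical_sector_size: int) -> int:
    """Fail a device whose logical ashift is larger than the pool's."""
    dev_ashift = logical_sector_size.bit_length() - 1  # log2 for powers of 2
    if dev_ashift > pool_ashift:
        raise AshiftMismatch(
            f"device ashift {dev_ashift} > pool ashift {pool_ashift}"
        )
    return dev_ashift
```

For example, `check_device(9, 4096)` raises, which is the case of an
ashift=9 image copied onto a 4K native disk, while `check_device(12, 512)`
succeeds.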

    Regards
    Steve



