Invalid ashift increase allowed by r253441

Steven Hartland killing at multiplay.co.uk
Thu Aug 1 08:23:35 UTC 2013


----- Original Message ----- 
From: "Xin Li" <delphij at delphij.net>
> Hi, Steven,
> 
> On 7/30/13 5:16 PM, Steven Hartland wrote:
> [...]
>>> For example on a native 4k sectorsize disk its not possible to
>>> directly access the bytes at offset 512 instead you would need to
>>> read offset 0 and scan in 512 bytes to the returned data block. I
>>> didn't think geom / zfs dealt with this case?
> 
> No, GEOM does not allow this.
> 
>>> The only case I can see this happening is if an old 512b
>>> sectorsize disk was dd'ed to a new 4k sectorsize one but thats
>>> effectively what this change is allowing.
> 
> I think your analysis is correct.  However, in my opinion, ZFS should
> make sure that the access is aligned when it issues the operation to
> the underlying GEOM provider, and banning import of pools with larger
> ashift is just papering problems out, which is not desirable.
> 
> [...]
>> I suspect this came about because of changes to how ashift was
>> being reported / faked, very simular to the patches many of us have
>> for FreeBSD.
>> 
>> If I'm correct this strengthens my view that this change is
>> incorrect, as what was trying to be done was support for faked
>> sector size on devices such as SSD's and HD's which report and
>> importantly support 512b sectors even if they are reported to ZFS
>> as 4k. This is very different from supporting a device which ONLY
>> supports a large sector size and hence smaller accesses would fail
>> which this change allows.
> 
> Well, we actually need to answer two questions, not just one:
> 
> a) should a pool be importable when ashift increases?
> 
> I think the answer is "Yes" here, which is done by r253441.  No, this
> is not optimal, but it does not really hurt anything if we make ZFS do
> the right thing, which is the second issue:
> 
> b) should ZFS access smaller block when ashift demands larger ones?
> 
> My answer would be "No" and in my opinion, this is the real underlying
> issue that needs to be addressed -- I would not really mind if we
> revert r253441 for now, but it would make us diverge further from
> upstream and, on the other hand, doesn't really buy us anything and
> papers the real issue out.

When I first thought about this I was in agreement that it really
should deal with this case but then I thought about the overhead
this would add for and every single IO request, and was would
that really be worth it?

Given the performance impact that is very evident when you use
SSD's that lie about their sector size I'd have to say I don't
think so.

Given this I currently plan to raise this upstream with the view
to have this change backed out. If iilumos wants to allow disk
sector size quirks it should do so properly like we've been
talking about at i.e. by using a combination of physical and
logical sector size reporting.

    Regards
    Steve



    Regards
    Steve


================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
or return the E.mail to postmaster at multiplay.co.uk.



More information about the zfs-devel mailing list