DELETE support in the VOP_STRATEGY(9)?

Warner Losh imp at bsdimp.com
Tue Dec 8 19:58:54 UTC 2015


> On Dec 8, 2015, at 12:46 PM, Dag-Erling Smørgrav <des at des.no> wrote:
> 
> Warner Losh <imp at bsdimp.com> writes:
>> Dag-Erling Smørgrav <des at des.no> writes:
>>> My point is that it's wrong to infer anything else from
>>> GEOM::candelete than the fact that BIO_DELETE requests will be
>>> accepted and may or may not do something, somewhere, at some point.
>>> We can easily create a different GEOM attribute which indicates that
>>> seeks are essentially free, and FFS could use that instead of
>>> GEOM::candelete to disable relocation.
>> When this was implemented, we thought about that. But we couldn't come
>> up with any cases where you'd have one set and not the other.  And the
>> actual thing you'd want isn't that seeks are free, though that's a
>> good clue. The actual thing you want is to know if there's a
>> performance benefit to keeping files contiguous, and the extent size
>> where that stops making sense.
> 
> I'm having a hard time understanding how the fact that seeks are
> essentially free is *not* a good indication that there is no benefit to
> keeping files contiguous, since keeping files contiguous is something we
> do to avoid the cost of seeking.  Support for deletion, on the other
> hand, is *completely* orthogonal.  And my example was not taken entirely
> out of the blue: I'm sure there would be a huge market for storage
> devices, whether electromechanical or solid-state, which implemented
> this in hardware, along with guarantees that reallocated sectors are
> truly non-recoverable.

I’d say it’s not nuanced enough. Seeks may be essentially free for SSDs,
but there’s still some benefit to clustering writes. Only the SSD’s firmware
can know what units it would prefer to write in, so that it spreads the
sectors across however many banks / chips make up one internal unit
that can be written in parallel.

While not perfect, GEOM::candelete gives an indication that the device does
storage management. In the vast majority of cases in actual hardware,
this means that the actual physical media is obscured by at least one
layer of indirection. Since there’s the layer of indirection, assumptions about
contiguity are out the window. While there may be a tiny fraction where
drives try to shoe-horn ‘reliable erasure’ into data set management trim
operations, or similar, I’d imagine that to get the assurances they’d want
from the OS and filesystems, they’d implement a new primitive or attribute
which would allow them to use a more robust command set to ensure when
the feature is engaged it’s working as advertised. So using GEOM::candelete
is good enough until actual problems can be demonstrated. We can do the
work then to solve the problem found with it rather than guess about what
the problems might be and design to that.

And to be fair, having an additional property of ‘seeks are nearly free’ would
also be a good way to tell. I’m not convinced it is worth the effort to add it to
all the storage devices in the tree when GEOM::candelete is a good proxy.
I guess if we’re going to the effort, I’d like there to be a richer set of data
provided than just ‘seeks are nearly free’.
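
To make the trade-off concrete, here is a rough sketch of how a filesystem
might consume a richer set of hints and fall back to GEOM::candelete as the
proxy when they are missing. Everything in it is hypothetical: struct
storage_hints, its fields, and both functions are names made up for
illustration, not existing GEOM attributes or FFS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical per-device hints. Only can_delete corresponds to something
 * that exists today (GEOM::candelete); the rest is an assumed richer set.
 */
struct storage_hints {
	bool     can_delete;      /* BIO_DELETE requests are accepted */
	bool     have_rich_hints; /* device reported the two fields below */
	bool     seeks_cheap;     /* 'seeks are nearly free' */
	uint32_t pref_write_unit; /* preferred write size in bytes, 0 = unknown */
};

/* Should the filesystem spend effort relocating blocks for contiguity? */
static bool
want_contig_realloc(const struct storage_hints *h)
{
	assert(h != NULL);
	if (h->have_rich_hints)
		return (!h->seeks_cheap);
	/* Proxy: a device that accepts deletes probably remaps blocks. */
	return (!h->can_delete);
}

/* Pick a write cluster size, preferring what the device asked for. */
static uint32_t
write_cluster_size(const struct storage_hints *h, uint32_t fs_default)
{
	assert(h != NULL);
	if (h->have_rich_hints && h->pref_write_unit != 0)
		return (h->pref_write_unit);
	return (fs_default);
}
```

With only the proxy available, a device that accepts deletes gets relocation
turned off. A device that accepts deletes but reports expensive seeks (the
secure-erase disk from the example upthread) would keep relocation once the
richer hints exist.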

Warner

