geli+trim support

Warner Losh imp at bsdimp.com
Wed Jul 9 16:22:33 UTC 2014


On Jul 5, 2014, at 2:36 AM, Poul-Henning Kamp <phk at phk.freebsd.dk> wrote:

> In message <53B750C1.8070706 at gooch.io>, Jesse Gooch writes:
> 
>>> If you TRIM, your old sector is still unchanged somewhere in flash, but
>>> if you're lucky for slightly less time.
>> 
>> Perhaps I misunderstand TRIM, isn't the point of TRIM that it zeroes out
>> the sector ahead of time so it doesn't have to re-do it again when it
>> stores more data in that sector later?
> 
> Yes.
> 
> But "ahead of time" does not mean "now."
> 
> It's a fairly lengthy explanation, but the short version is that TRIM'ing
> a sector means that the FTL knows you don't care about the contents of
> the sector, so it need not be preserved during "washes".
> 
> When the washes actually happen depends on how large the actual free-pool
> is and very strongly on if an eraseblock happens to be all TRIM'ed and
> finally on the wear-levelling algorithm and the characteristics of the
> flash that informs it.
> 
> There is no way to characterize any of these things without full
> access to the FTL.

The only way to be sure the data is gone is a secure erase. And even then
it can only be on a best-effort basis because the NAND chips’ charge pumps
can and do fail and once that happens, you can no longer erase anything
on that part.

Other than that, PHK is right: the FTL decides when it will “groom” or “garbage
collect” the old erase block that contains the disk block that you just trimmed.
The erase block is hundreds of pages long. Each page holds several disk blocks
in a typical implementation. The NAND flash simply cannot program on a
sub-page basis[*], program a page twice[**] or erase with any granularity smaller
than an erase block.
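
To put rough numbers on that, here is a back-of-the-envelope sketch. The sizes
are illustrative assumptions, not figures from any particular part: 512-byte
disk blocks, 4 KiB pages and 256 pages per erase block.

```python
# Illustrative geometry only; real parts vary widely (all three sizes are assumptions).
SECTOR_BYTES = 512            # assumed disk block size
PAGE_BYTES = 4096             # assumed NAND page size
PAGES_PER_ERASE_BLOCK = 256   # assumed "hundreds of pages" per erase block

# Several disk blocks share one page; many pages share one erase block.
sectors_per_page = PAGE_BYTES // SECTOR_BYTES
sectors_per_erase_block = sectors_per_page * PAGES_PER_ERASE_BLOCK
erase_block_bytes = PAGE_BYTES * PAGES_PER_ERASE_BLOCK


def neighbours_of_trimmed_sector():
    """Disk blocks sharing an erase block with a single trimmed disk block."""
    return sectors_per_erase_block - 1


print(sectors_per_page, "disk blocks per page")
print(sectors_per_erase_block, "disk blocks per erase block")
print(erase_block_bytes // 1024, "KiB per erase block")
print(neighbours_of_trimmed_sector(), "other disk blocks share the erase block")
```

With these assumed sizes, trimming one disk block leaves a couple of thousand
neighbours in the same erase block, any of which may still be live, so the
erase cannot simply happen on the spot.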

The one thing that PHK forgot to mention is that flash devices are laid out in a log
fashion, i.e. the next written block follows the previously written block (more or less)
in the physical NAND media, which is why the FTL is involved at all. Again, this
is due to pages and erase blocks and the write-once-then-erase physics of NAND.
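
A toy model may make the log layout and the delayed erase easier to see. This
is only a sketch under heavy simplifying assumptions, not how any real FTL is
implemented: the ToyFTL class and its methods are hypothetical names, one
logical block occupies one page, grooming never relocates live pages, and
erased blocks are never reused.

```python
class ToyFTL:
    """Toy log-structured FTL, for illustration only (hypothetical, not a real design)."""

    def __init__(self, erase_blocks=4, pages_per_block=4):
        self.pages_per_block = pages_per_block
        # flash[b][p] holds the programmed data, or None if the page is erased.
        self.flash = [[None] * pages_per_block for _ in range(erase_blocks)]
        # valid[b][p] is True while some logical block still maps to that page.
        self.valid = [[False] * pages_per_block for _ in range(erase_blocks)]
        self.map = {}   # logical block -> (erase block, page)
        self.head = 0   # next physical page in the log

    def write(self, lba, data):
        """Program the next page in the log; the old copy is only marked stale."""
        if self.head >= len(self.flash) * self.pages_per_block:
            raise RuntimeError("log full; a real FTL would garbage collect here")
        self._invalidate(lba)
        b, p = divmod(self.head, self.pages_per_block)
        self.flash[b][p] = data
        self.valid[b][p] = True
        self.map[lba] = (b, p)
        self.head += 1

    def trim(self, lba):
        """TRIM drops the mapping; the data itself stays in flash."""
        self._invalidate(lba)

    def _invalidate(self, lba):
        loc = self.map.pop(lba, None)
        if loc is not None:
            b, p = loc
            self.valid[b][p] = False

    def groom(self):
        """Erase fully written blocks with no valid pages; return how many were erased."""
        erased = 0
        for b in range(len(self.flash)):
            fully_written = (b + 1) * self.pages_per_block <= self.head
            has_data = any(d is not None for d in self.flash[b])
            if fully_written and has_data and not any(self.valid[b]):
                self.flash[b] = [None] * self.pages_per_block
                erased += 1
        return erased


ftl = ToyFTL(erase_blocks=2, pages_per_block=2)
ftl.write(0, "secret")
ftl.write(1, "other")
ftl.trim(0)
print(ftl.flash[0][0])   # the trimmed data is still physically present
print(ftl.groom())       # nothing erased: logical block 1 still lives in that erase block
ftl.write(1, "other2")   # the rewrite lands at the head of the log, in the next erase block
print(ftl.groom())       # now the old erase block is entirely stale and gets erased
```

In this model the trimmed data survives until every other page in its erase
block has also gone stale, which is the "ahead of time does not mean now"
behaviour described above.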

Warner

[*] Well, in some rare edge cases you can, but most modern chips don’t let you do
that reliably, and certainly not for previously programmed pages in a fully programmed
block.
[**] The firmware usually prevents you from doing this, especially in MLC designs where
you program the cells twice with data from two different pages, but that’s a different
kettle of fish...


More information about the freebsd-hackers mailing list