how to measure microsd wear

Sat Jan 21 23:40:15 UTC 2017

On 1/21/2017 13:11, Ian Lepore wrote:
> On Sat, 2017-01-21 at 12:12 -0600, Karl Denninger wrote:
>> On 1/21/2017 11:58, Ian Lepore wrote:
>>> On Sat, 2017-01-21 at 15:46 +0000, tech-lists wrote:
>>>> Hello list,
>>>>
>>>> How would one measure microsd wear? Is there a utility like
>>>> smartmontools (I think this only works for regular hard drives) but
>>>> for microsd?
>>>>
>>>> many thanks,
>>> There is basically no way to see what's going on in the flash array
>>> of an sdcard.  The microcontrollers in modern sd cards have complex
>>> wear-leveling algorithms which are completely transparent to the
>>> outside world.
>> This is true.
>>> On the plus side, most of what you see in the way of warnings and
>>> scare stories about wearing out sd cards is pure BS.  I've got
>>> systems here that have been running for literally years on the same
>>> sdcard, and that card is being used for swap, and routine data
>>> storage like syslog (on an embedded system that logs status and
>>> progress pretty much continuously 24x7 for years).  I've seen a few
>>> sd cards die over the years, but I've never been able to say it was
>>> because of how much was written to them (indeed, the dead ones I've
>>> got weren't in service long before they died).
>>>
>> This, however, is total nonsense.
>>
> Well, no, it's not total nonsense, it's my 10 years of experience
> professionally working with sd cards in embedded systems sold as
> commercial products, including extensive testing of the card trying to
> *induce* failure.
>
> Next time think twice before implying I'm either a fool or a liar.
I didn't say you're a fool or a liar, I said you're *wrong* with the
statement that SD cards don't fail in this application -- specifically
microSD cards on embedded machines such as the RPI series running
FreeBSD, along with the implication that the only failures you will
see are infant-mortality related.

In fact I just had a failure *today* on a production RPI2.  While
trying to re-copy the filesystem back to it (after it failed, plugged
into a different box to check it out) I got the usual behavior:

(da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 00 1c 4f 40 00 00 80 00
(da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da0:umass-sim0:0:0:0): Retrying command

The writing process is frozen because the card has write-locked itself. 
I confirmed this by attempting to fsck the card after detaching and
reattaching it, and as soon as fsck attempted to write to it another
error was taken; it's effectively write-locked.  I can mount it
read-only and read it, but cannot write to it.
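
As a rough illustration of spotting this pattern automatically, here is a
minimal Python sketch that counts failed WRITE commands per device in
kernel log text.  The regular expressions are an assumption based only on
the three log lines quoted above, and the function name
count_failed_writes is hypothetical; other drivers or releases may format
these messages differently.

```python
import re
from collections import Counter

# Assumed format, taken from the log lines quoted above:
#   (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ...
#   (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
CMD_RE = re.compile(r"\((?P<dev>[a-z]+\d+):[^)]*\): (?P<op>[A-Z]+)\(\d+\)\. CDB:")
ERR_RE = re.compile(r"\((?P<dev>[a-z]+\d+):[^)]*\): CAM status: (?P<status>.+)")


def count_failed_writes(log_lines):
    """Count CAM error reports that follow a WRITE command, per device."""
    counts = Counter()
    last_op = {}  # device name -> most recent command seen for it
    for line in log_lines:
        cmd = CMD_RE.search(line)
        if cmd:
            last_op[cmd.group("dev")] = cmd.group("op")
            continue
        err = ERR_RE.search(line)
        if err and last_op.pop(err.group("dev"), None) == "WRITE":
            counts[err.group("dev")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "(da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 00 1c 4f 40 00 00 80 00",
        "(da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error",
        "(da0:umass-sim0:0:0:0): Retrying command",
    ]
    for dev, n in count_failed_writes(sample).items():
        print(dev, n)
```

A device that keeps accumulating failed writes while reads still succeed
is a candidate for the write-locked state described here; the count alone
does not prove it.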

Another one goes into the bin, and a new card comes out to be set up
with said filesystem.  In fact I'm doing that right now while typing this.

This card was roughly a year old in production use, and gets a fair bit
of write activity -- but not a crazy amount by any means.

Your mileage may vary, but this is what I've repeatedly seen in behavior
by these cards when they go bad -- and this one is not a low-hour
failure either, nor is it an off-brand -- it's a Sandisk Ultra 32GB and
the machine has roughly a year of 24x7x365 uptime on it. 

Sandisk will replace them on request but they most definitely do
occasionally fail.  I've started using Samsung EVO cards and I've yet to
have any of those crap out, but none of them have more than six months
of uptime on them at this point and the failures are rare enough that
until I get a couple of years on the EVOs without a failure I can't
reasonably say they're superior in this regard.

This is the first one that I've had happen in this particular use case
and I was surprised by it because of what that machine does.  The other
two were both much more write-heavy applications, one of them a
development unit on my desk that gets a lot of compile activity on it.

> I'm not even going to read the rest of the crap you wrote, since it's
> completely invalidated by the stupid thing you said above.
>
> -- Ian
>
>
>> I've had multiple SD card failures in build/test/high-volume write
>> environments on the PI2 series over the last year and change.  There
>> are two general ways in which you will see failures:
>>
>> 1. The card write-locks itself. This is a defensive move by the
>> controller when it determines that it cannot reallocate a failed
>> block during a write (e.g. it's out of spares) OR it takes an
>> unrecoverable read error.
>>
>> 2. The card loses its allocation map (in which case you're completely
>> screwed; it will show up as zero size if you manage to get it mounted
>> somewhere.)
>>
>> If you get a type 1 failure you can copy everything on the card off;
>> provided you do not attempt to write it, you will not get errors.
>> Prior to a fairly recent MFC if you had soft-updates on and took a
>> Type 1 failure you'd get an instant panic; this has been (I believe
>> entirely) fixed.
>>
>> In the event you get a Type 2 failure there's nothing you can do.  In
>> both cases the card is junk if it happens.
>>
>>
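
The two failure modes in the quoted text can be summarized as a small
decision sketch in Python.  The function name classify_card and its three
inputs are hypothetical; how you obtain them (for example from the
reported media size and from a read-only mount attempt) is left out, and
the mapping only restates the symptoms described above.

```python
def classify_card(size_bytes, can_read, can_write):
    """Map observed symptoms to the failure types described in the thread.

    size_bytes: media size the card reports (0 if it shows up as zero size)
    can_read:   True if data can be read back (e.g. mounted read-only)
    can_write:  True if a write to the card completes without error
    """
    if size_bytes == 0:
        # Type 2: allocation map lost; nothing can be recovered.
        return "type 2"
    if can_read and not can_write:
        # Type 1: write-locked; copy the data off without writing to it.
        return "type 1"
    if can_read and can_write:
        return "no failure observed"
    return "unclassified"


if __name__ == "__main__":
    # A 32 GB card that reads fine but rejects writes.
    print(classify_card(32 * 10**9, can_read=True, can_write=False))
```

Either "type 1" or "type 2" means the card should be replaced; only the
first leaves the data recoverable.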

-- 
Karl Denninger
karl at denninger.net <mailto:karl at denninger.net>
/The Market Ticker/
/[S/MIME encrypted email preferred]/