Bad Blocks... Should I RMA?
smithi at nimnet.asn.au
Tue Nov 17 15:52:05 UTC 2009
In freebsd-questions Digest, Vol 285, Issue 3, Message 28
On Mon, 16 Nov 2009 23:16:27 +0100 Roland Smith <rsmith at xs4all.nl> wrote:
> On Mon, Nov 16, 2009 at 09:43:31PM +0000, Bruce Cran wrote:
> > On Mon, 16 Nov 2009 19:23:58 +0100
> > Roland Smith <rsmith at xs4all.nl> wrote:
> > > Install the smartmontools port, and check the drive with
> > > 'smartctl -a /dev/ad4'. If you see a non-zero Reallocated_Sector_Ct,
> > > RMA it immediately, as it is about to fail. If you see other errors
> > > reported, RMA it.
> > >
> > > (S)ATA disks have spare sectors available. If a sector fails, it is
> > > replaced by one of the spares by the firmware. If you see a non-zero
> > > Reallocated_Sector_Ct, it means that the drive has run out of spares.
> > > This is bad news.
> > Surely it's the other way around - if you see a value of zero in the
> > "value" column the drive has run out of spare sectors and it's time to
> > RMA the drive?
> I was talking about the _RAW_VALUE column. There seem to be some differences
> in interpretation between vendors as to what the VALUE column means. Most of
> the advice I've seen over the years says to look at the RAW_VALUE.
> See http://en.wikipedia.org/wiki/S.M.A.R.T. as well.
Mmm, but as that article points out (it really only mentions the
'normalised' values smartctl presents in passing), there can be quite a
lot of variation between manufacturers as to what RAW_VALUE actually
represents for various attributes, whereas the usage of the VALUE, WORST
and THRESH values is much more consistent, and is what the vendor is
actually presenting to the world as the SMART good/fair/fail analysis.
For instance, I've got two Fujitsu 5400rpm 2.5" drives in two laptops,
one MHV2040AH with near 19,000 hours on it, and a much newer MHV2120AH,
40 and 120GB respectively. Nice quiet low-power laptop drives, fwiw.
Both show as (more recently) being in the smartctl database, and both
show _exactly_ the same values for this one:
5 Reallocated_Sector_Ct 0x0033 100 100 024 Pre-fail Always - 8589934592000
Now if that were a number of 512-byte sectors, it'd be 4096000 GB! :)
But both drives are 100% OK, as the VALUE / WORST figures show.
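
For anyone wanting to check that arithmetic, here is a rough Python sketch that splits the attribute line quoted above into fields and redoes the sum. The field names and the parse_attribute helper are my own invention, not anything smartctl provides, and splitting the 48-bit raw field into three 16-bit words is only one guess at how a vendor might pack it; what each word means on these Fujitsu drives is not something I know.

```python
# The attribute line quoted above, as printed by smartctl.
LINE = "5 Reallocated_Sector_Ct 0x0033 100 100 024 Pre-fail Always - 8589934592000"


def parse_attribute(line):
    """Split one smartctl attribute row into named fields (hypothetical helper)."""
    f = line.split()
    return {
        "id": int(f[0]),
        "name": f[1],
        "flag": f[2],
        "value": int(f[3]),
        "worst": int(f[4]),
        "thresh": int(f[5]),
        "type": f[6],
        "updated": f[7],
        "when_failed": f[8],
        "raw": int(f[9]),
    }


attr = parse_attribute(LINE)
raw = attr["raw"]

# Reading the raw value naively as a count of 512-byte sectors,
# expressed in units of 2**30 bytes:
gib = raw * 512 // 2**30

# The raw field is 48 bits wide. Split it into three 16-bit words,
# lowest word first. How a vendor uses each word is an assumption here.
words = [(raw >> shift) & 0xFFFF for shift in (0, 16, 32)]

print(gib)       # 4096000
print(hex(raw))  # 0x7d000000000
print(words)     # [0, 0, 2000]
```

The lowest two words come out as zero, which at least fits with a drive that has not reallocated anything and is using the upper bits for something else. That is speculation on my part, and another reason not to read the raw number as a plain sector count.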
> > From what I've seen the 'raw' column appears to count
> > the number of sectors the drive has remapped using the spares buffer.
> > If it gets into the hundreds it's probably time to think about RMA'ing
> > the drive
> Yes, the raw value is the number of sectors allocated from the spares. I
> originally thought it was the number of reallocations _beyond_ the
> spares. That's a misunderstanding on my part.
Again, may depend on the drive make/model. With the same make/model you
can of course usefully compare raw values, but be careful about drawing
inferences for different drives, or you may be RMA'ing needlessly ..
> Nevertheless this attribute (along with several) is marked on the Wikipedia
> page for smart as a "Potential indicator of imminent electromechanical
> failure". You can find the same attributes marked as critical when perusing
> mailing list archives.
> For me, my data is worth much more than the harddisk it is on. Some of it is
> literally irreplaceable. So my policy is to go look for a replacement harddisk
> as soon as the RAW_VALUEs of any of these critical indicators start going up
> from zero. And store any data at least on two harddisks, whether in a mirror
> or in a cron+rsync setup.
That'd be the case for the disks you tend to use. I was first going to
reply to Bruce's message when I spotted yours, but you've dropped the
last bit of his quote, that I was about to wholeheartedly agree with :)
: If it gets into the hundreds it's probably time to think about RMA'ing
: the drive - if you trust that the 'raw' column is reporting what you
: think it is (you should really only base your decision on the value,
: worst and threshold columns).
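
Basing the decision on the value, worst and threshold columns could look something like the sketch below. The attribute_status function and its return strings are made up for illustration. The rule it applies (at or below threshold means failed, and a threshold of 0 means the attribute never fails) is my understanding of how the normalised values are meant to be read, so treat it as an assumption and check it against the smartctl man page.

```python
def attribute_status(value, worst, thresh):
    """Classify one SMART attribute from its normalised columns.

    Assumed rule: an attribute has failed when its normalised value is
    at or below the threshold, and a threshold of 0 never triggers.
    """
    if thresh == 0:
        return "ok"
    if value <= thresh:
        return "failing_now"
    if worst <= thresh:
        return "failed_in_the_past"
    return "ok"


# The Fujitsu figures quoted earlier: VALUE 100, WORST 100, THRESH 024.
print(attribute_status(100, 100, 24))
```

With those figures the drive is nowhere near its threshold, whatever the raw column says.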