# Yet another RAID Question (YARQ)

Sandy Rutherford sandy at krvarr.bc.ca
Thu Jun 23 08:15:04 GMT 2005

```
>>>>> On Wed, 22 Jun 2005 23:37:20 -0700,
>>>>> "Ted Mittelstaedt" <tedm at toybox.placo.com> said:

> Seagate wrote a paper on this titled:

> "Seagate Technology Paper 338.1 Estimating Drive Reliability in
> Desktop Computers and Consumer Electronic Systems"

> that explains how they define MTBF.  Basically, they define MTBF as
> what percentage of disks will fail in the FIRST year.

Is this in the public domain?  I wouldn't mind having a look at it.

> What they are saying is if you purchase 160 Cheetahs and run them at
> 100% duty cycle for 1 year then there is 100% chance that 1 out of the
> 160 will fail.

> Thus, if you only purchase 80 disks and run them at 100% duty cycle for 1
> year, then you only have a 50% chance that 1 will fail.  And so on.

> Ain't statistics grand?  You can make them say anything!  For an encore
> Seagate went on to prove that their CEO would live 3 centuries
> by statistical grouping. :-)

Now don't knock statistics.  The problem does not lie with statistics,
but with its misuse by people who do not understand what they are
doing.  No, I am not a statistician; however, I am a mathematician.

> So, in getting back to the gist of what I was saying, the issue is
> as you mentioned standard deviation.  I think we all understand that
> in a disk drive assembly line that it's all robotic, and that there
> is an extremely high chance that disk drives that are within a few
> serial numbers of each other are going to have virtually identical
> characteristics.  In fact I would say using the Seagate MTBF definition,
> that 1 in every 160 drives manufactured in a particular run is going
> to have a significant enough deviation to fail at a significantly
> different period of time, given identical workload.

I am not so sure.  If we were talking about can openers, I would
agree.  However, a disk drive is basically a mechanical object which
performs huge numbers of mechanical actions over the course of a
number of years.  Even extremely minute variations in the
physical characteristics of the materials could lead to substantive
variations over time.  However, the operative word here is "could".
Real data is required.  I tried to google for a relevant study, but
came up empty.  This surprised me as it seems like the sort of thing
that masses of data should have been collected for.

> In short you have better than 99% chance that if you install 2 brand
> new Cheetahs that are from the same production run, they will have
> virtually identical characteristics.  And, failure due to wear is going
> to be very similar - there's only so many times the disk head can seek
> before its bearings are worn out - and you're proposing to give them
> the exact same usage.

> I think the reason you're seeing alternation is that the disks are
> so damn fast that they complete their reads well before their internal
> buffers have finished emptying themselves over the SCSI bus to the
> array card.  In other words, you wasted your money on your fast disks,

Not much money.  After having been burned by failures of lower end
drives, I bought high-end stuff on eBay.  Made me nervous at the
beginning, because who knows how many flights of stairs the drive
bounced down before it was popped into the mail, and for that matter,
who knows how many flights of stairs it bounced down while it was in
the mail.  However, so far it has worked out quite well.

> if you had used slower disks you would see identical read performance
> but you would see less alternative flickering
> and more simultaneous and continuous activity.

> If you got a faster array card you wouldn't see the alternative
> flickering.

> Or, it could be the PCI bus not being fast enough for the array card.

It's almost certainly the PCI bus.  The DAC1100, although not
state-of-the-art, is still reasonably fast.  It has 3 U2W channels and
it could certainly max out my PCI bus.

> Ah well, a computer just wouldn't be a computer without blinking
> lights on it!!! ;-)

Gotta agree there ;-)  Once upon a time I had the DIP switch settings
required to boot a PDP-11 from the front panel memorized, because I
had to do it so often.  Our data runs extended far beyond the typical
uptime, so we did checkpoints by dumping the relevant bits of core to
a teletype, and I had to retype the data from the teletype whenever we
brought it back up after a crash.  Even on an old PDP-11, this took a
while.  We needed 3 months+ of uptime and we did well if we could keep
that thing up for longer than a week.  I became well-acquainted with
those DIP switches.

Sandy
```
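
The 160-drive and 80-drive figures quoted above can be checked with a short calculation. This is a minimal sketch, assuming each drive fails independently with a 1-in-160 chance per year (a reading of the quoted figure, not a number taken from the Seagate paper). The function name `chance_of_any_failure` is hypothetical.

```python
# Assumption: each drive independently has a 1-in-160 chance of failing
# within one year at 100% duty cycle (the figure quoted in the thread).
P_SINGLE = 1.0 / 160.0


def chance_of_any_failure(n_drives: int, p_single: float = P_SINGLE) -> float:
    """Probability that at least one of n independent drives fails."""
    # Complement of "every drive survives the year".
    return 1.0 - (1.0 - p_single) ** n_drives


if __name__ == "__main__":
    for n in (80, 160, 320):
        expected = n * P_SINGLE  # expected number of failures, not a probability
        print(f"{n:4d} drives: expected failures = {expected:.2f}, "
              f"P(at least one) = {chance_of_any_failure(n):.1%}")
```

Under that assumption, 160 drives give an expected count of one failure per year, but the chance of seeing at least one failure is roughly 63%, not 100%. With 80 drives it is roughly 39%, not 50%. The expected count scales linearly with the number of drives; the probability of at least one failure does not.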