Is this a hard drive crash?

Bill Moran wmoran at potentialtech.com
Fri Mar 30 10:43:16 UTC 2007


web at 3dresearch.com wrote:
>
> At 11:03 AM 3/29/2007, you wrote:
> >Janos Dohanics wrote:
> >
> >>I also ran the Seagate drive utility which found no problems with the
> >>drive. When the same kind of crash happened again, I thought the problem
> >>may be the IDE controller, I have replaced the motherboard.
> >>Now it crashed again with the new motherboard - and I don't know what
> >>should I do next: should I just replace an apparently good hard drive?
> >
> >If you know the drive is good (e.g. by testing it in another machine), the 
> >first thing you should replace is the power supply. Weird drive behavior 
> >is often a sign of weak PSUs.
> 
> FWIW, I tested the power supply with an inexpensive power supply tester; it 
> checked out. I guess I should replace it, nonetheless...
> 
> >>Also, if this is just a hard drive crash, shouldn't  the system keep
> >>going?
> >
> >So, you're saying that if a drive starts giving invalid or 
> >noninterpretable communications back to the IDE controller, causing the 
> >controller to wedge, which possibly brings down the PCI bus on which it's 
> >connected, tied to the front side bus and the CPU, the OS should just 
> >continue? On what?

This does actually happen sometimes.  I had a desktop PC where the
drive failed in this manner -- the drive just "went away" and the
OS kept going.  I'm not saying that you're wrong -- I'm just saying
that PC hardware is weird enough that strange things are not only
possible, they're likely.

> I guess you make a good point: the fact that the system wedges, points to 
> something other than the drive. Still, it's always the same drive that quits...

It's almost definitely a HDD crash.

The problem is that if the circuitry on the HDD has gone flakey, it may
pass the Seagate test just fine.  In order for Seagate's test to
fail, you'd have to test it while the drive is flakey.  If I
understand your description of the problem, the drive works most of
the time, and crashes occasionally.  As a result, statistically,
the drive is probably fine when you're running the Seagate tests.

We have the same problem with RAM going bad.  RAM tends to go bad
by becoming unpredictable.  Then someone runs memtest for 10 minutes
and says, "nope, RAM is just fine".  Which is nonsense.  If the
RAM was bad every 10 minutes, your computer would be completely
unusable.  If it's only crashing once a day, you need to let memtest
run all day to see if it can catch the problem in the act.  Hard drives
_usually_ fail dramatically -- but occasionally they fail in the
same way RAM does, which seems like what's happening to you.  A lot
of techs don't see this case very often (because it doesn't happen
like this very often) so they don't recognize it.
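
To put rough numbers on the reasoning above, here is a minimal sketch.
It assumes faults show up at random at a steady average rate (a Poisson
process).  That model, the function name and the one-fault-per-day
default are assumptions for illustration only, not measurements from
this drive or from any RAM.

```python
import math

def chance_of_catching_fault(test_minutes, faults_per_day=1.0):
    """Probability that at least one fault occurs during the test window.

    Assumes faults arrive as a Poisson process at a constant average
    rate, which is an illustrative simplification.
    """
    minutes_per_day = 24 * 60
    # Expected number of faults that fall inside the test window.
    expected_faults = faults_per_day * test_minutes / minutes_per_day
    # P(at least one fault) = 1 - P(no faults at all).
    return 1.0 - math.exp(-expected_faults)

# Compare a quick test against longer runs.
for minutes in (10, 60, 24 * 60, 3 * 24 * 60):
    chance = chance_of_catching_fault(minutes)
    print(f"{minutes:5d} min test: {chance:.1%} chance of seeing a fault")
```

Under those assumptions a 10 minute test has less than a 1% chance of
seeing a once-a-day fault, and even a full day of testing only catches
it about 63% of the time, so a clean short test says very little.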

-- 
Bill Moran
http://www.potentialtech.com


More information about the freebsd-questions mailing list