Harddisk failure causes system crash, please help

Jeremy Chadwick koitsu at FreeBSD.org
Thu Nov 8 22:52:07 PST 2007

On Fri, Nov 09, 2007 at 08:29:52AM +0200, David Naylor wrote:
> I remember seeing a timeout of sorts once, it was while doing a dd.  I
> have done further dd tests and only the one slice causes this problem:
> ad0e

Okay, so the problem is probably confined to that area of the disk...
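
If you want to narrow down where in the slice the reads fail, a
sequential read that keeps going past errors (roughly what dd with
conv=noerror does) can log the failing offsets.  Below is a minimal
Python sketch of that idea, not a tested diagnostic tool: the scan()
helper, the 64 KiB chunk size and the max_bad cut-off are my own
assumptions.  Keep in mind that reading the bad area may trigger the
same timeouts (or reboot) you've already seen.

```python
import os
import sys

CHUNK = 64 * 1024  # assumed read size; a multiple of the 512-byte sector size


def scan(path, chunk=CHUNK, max_bad=100):
    """Read `path` sequentially, skipping chunks that fail to read.

    Returns (bytes_read_ok, bad_offsets), where bad_offsets lists the
    starting byte offset of every chunk whose read raised an I/O error.
    Gives up once max_bad failing chunks have been recorded.
    """
    bad = []
    ok = 0
    offset = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while len(bad) < max_bad:
            try:
                data = os.pread(fd, chunk, offset)
            except OSError:
                # Read failed: remember where, then skip over this chunk.
                bad.append(offset)
                offset += chunk
                continue
            if not data:
                break  # end of device or file
            ok += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return ok, bad


if __name__ == "__main__" and len(sys.argv) == 2:
    good_bytes, bad_offsets = scan(sys.argv[1])
    print("bytes read without error:", good_bytes)
    for off in bad_offsets:
        print("read error in chunk starting at byte offset", off)
```

Run it as root against the slice, e.g. with /dev/ad0e as the only
argument.  Clustered offsets would point at one damaged region rather
than a controller or cabling problem, though that is only a hint.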

> > broken somehow), but all your problems seem to indicate issues with the
> > disk.
> Do you know of any test I can run using Windows (BartPE) that could
> possibly diagnose the problem (or at least confirm it is not FreeBSD's
> fault for rebooting and just hardware error)?

There's a free utility called HDTune with a sector scanner ("Error
Scan") that explicitly looks for bad sectors.  I would *uncheck* the
Quick Scan box.  If nothing shows up there, I'd check your Event Log to
see if there are any reports of disk/controller issues.

You might also be able to use that utility to get SMART stats for the
drive, although smartctl -a /dev/ad0 (from smartmontools, run under
FreeBSD) should suffice too.  The disk itself may have been relocating
data onto working sectors all this time; usually SMART will show that
(but not always -- depends on how the disk manufacturer did their
firmware).
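
To pick the remapping-related counters out of the SMART attribute
table, something like the following Python sketch could help.  It
assumes the usual 10-column layout printed by smartctl -A (ID, name,
flag, value, worst, thresh, type, updated, when_failed, raw value).
The parse_attributes() and suspicious() helpers and the WATCH list are
my own illustration, and which attributes a drive actually reports
depends on its firmware.

```python
import subprocess
import sys

# Attribute names (as smartctl prints them) commonly tied to sector
# remapping.  Assumed list; not every drive reports all of these.
WATCH = (
    "Reallocated_Sector_Ct",
    "Reallocated_Event_Count",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)


def parse_attributes(text):
    """Return {attribute_name: raw_value_string} from `smartctl -A` output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header and other lines are skipped.
        if len(fields) >= 10 and fields[0].isdigit():
            attrs[fields[1]] = fields[9]
    return attrs


def suspicious(attrs, watch=WATCH):
    """Return the watched attributes whose raw value is a non-zero integer."""
    found = {}
    for name in watch:
        raw = attrs.get(name)
        if raw is not None and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found


if __name__ == "__main__" and len(sys.argv) == 2:
    # Requires smartmontools to be installed; usually needs root.
    result = subprocess.run(
        ["smartctl", "-A", sys.argv[1]], capture_output=True, text=True
    )
    hits = suspicious(parse_attributes(result.stdout))
    if hits:
        for name, value in hits.items():
            print(name, "raw value:", value)
    else:
        print("no non-zero remapping counters found (not proof of a healthy disk)")
```

Pass the device (e.g. /dev/ad0) as the only argument.  A non-zero raw
value for any of these suggests the drive has been remapping or failing
to read sectors; all zeros does not rule out a problem, for the firmware
reasons mentioned above.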

But keep in mind Windows is one of the most silent OSes I've ever seen
when it comes to disk errors.  A disk can be failing miserably and it'll
never bother to report ATA timeouts or anything else in the event log.
The easiest ones to detect are mechanical failures, since all disk I/O
will stop ("why is my machine hanging?!?"), and if you're "lucky",
you'll hear the drive making scary noises.

