dealing with a failing drive
jdow at earthlink.net
Wed Nov 14 17:26:21 PST 2007
From: "Jerry McAllister" <jerrymc at msu.edu>
Sent: Monday, November 12, 2007 12:53
> On Mon, Nov 12, 2007 at 09:26:38AM -0800, David Newman wrote:
>> On 11/12/07 8:14 AM, Jerry McAllister wrote:
>> > An update: After doing what you suggest (leaving in the "good" disk,
>> > adding a new disk, RAID rebuilding) I still got soft write errors --
>> > with *either one* of the disks I tried.
>> > Then I tried putting both disks in an identical server and they came up
>> > fine, no read or write errors.
>> > Ergo, the RAID controller is bad and the disks may be OK.
>> >> Probably not.
>> >> Generally, if the RAID controller is bad, you will see errors
>> >> all over and not in just one place, tho I suppose it is possible.
>> >> Check and see what it reports as error locations and see if they
>> >> move around any.
>> Jerry, thanks for your response.
>> After 36 hours of running the same disks in a different, identical
>> machine there hasn't been a single read or write error. I'm hardly a
>> storage expert but from the evidence I have I'm inclined to believe the
>> root cause was a bad RAID controller and not failed disks.
> That is not much proof.
> The different machine would probably be accessing the disks in
> a different way, either slightly different positioning or using
> different space. Also, 36 hours is not really much time.
Dn, I have had a Promise controller that was bad. I kept getting errors
at one specific location on two disks out of three on a RAID 5. The
system continued to operate. When I finally spent the time to nail it
down to the controller I found the Promise people more than anxious to
get the beast for a postmortem. It had been bad for me from day one. It
would take about a week to a month for the problem to appear. After the
6th disk showed the problem at the same block number, the coin dropped
in my sometimes overly slow mind.
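
The check that finally gave it away can be sketched roughly as below. This
is only an illustration: it assumes the error reports have already been
reduced by hand to (disk, block) pairs, and the function name, disk labels
and block numbers are all made up, not taken from any real controller log.

```python
from collections import defaultdict


def blocks_shared_across_disks(errors, min_disks=2):
    """Return {block: sorted disk ids} for blocks that failed on at least
    min_disks distinct disks.

    errors is an iterable of (disk_id, block_number) pairs. This input
    format is an assumption; real logs would need parsing first.
    """
    disks_by_block = defaultdict(set)
    for disk_id, block in errors:
        disks_by_block[block].add(disk_id)
    return {
        block: sorted(disks)
        for block, disks in disks_by_block.items()
        if len(disks) >= min_disks
    }


# Hypothetical error records: (disk identifier, failing block number).
sample_errors = [
    ("disk1", 123456),
    ("disk2", 123456),
    ("disk4", 123456),
    ("disk3", 98765),
]

suspects = blocks_shared_across_disks(sample_errors)
print(suspects)
```

A block that keeps failing on several different physical disks points away
from the disks and toward something they share, such as the controller. A
block that fails on only one disk, like the last record above, says nothing
either way.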