unintended ATARAIDDELETE

Barney Wolff barney at databus.com
Sat Oct 18 19:03:12 PDT 2003


On Sat, Oct 18, 2003 at 04:14:53PM -0700, Doug White wrote:
> 
> > I've had a very odd problem with a -stable system on an Asus A7V333-raid,
> > which has a Promise raid controller on the motherboard.  For several days
> > in a row the system lost its raid0 array during the 3am daily run, leaving
> > it with no disk.  The raid was actually turned off in the bios, with
> > manual intervention required on reboot to turn it back on.  I suspected
> > hardware, but in desperation booted a -stable kernel from 10/3/03.  That
> > kernel survived the daily run, and reported the following:
> > Oct 14 14:41:43 192.168.24.4 /kernel.maybe.ok: ad6: hard error reading fsbn 133757952 of 0-127 (ad6 bn 133757952; cn 132696 tn 6 sn 6) trying PIO mode
> > (I should note that I added a script in /usr/local/etc/periodic/daily to
> > back up this system, so files are read that normally see no access.)
> 
> This usually means your disk is bad, which is why it keeps trashing the
> array.  Your system is trying to tell you something :-)

Well, of course the bad block is hardware.  But deleting a raid0 on a
hard error is insane.  I can more-or-less understand why that might be
thought sensible for raid1, but a split raid0 is of no use for anything.
Nor could I find anywhere in the kernel that actually deletes the raid.
But for sure -stable from 9/24 behaved differently (i.e., sanely) on
getting the error than -stable from 10/13 or so.  I don't think that
difference is hardware.  Time will tell, perhaps.

-- 
Barney Wolff         http://www.databus.com/bwresume.pdf
I'm available by contract or FT, in the NYC metro area or via the 'Net.

