drive failure during rebuild causes page fault

Søren Schmidt sos at
Mon Dec 13 22:59:26 PST 2004

Doug White wrote:
> On Mon, 13 Dec 2004, Joe Rhett wrote:
>>>This is why I don't trust ATA RAID for fault tolerance -- it'll save your
>>>data, but the system will tank.  Since the disk state is maintained by
>>>the OS and not abstracted by a separate processor, if a disk dies in a
>>>particularly bad way the system may not be able to cope.
>>Yes, but SATA isn't limited by this problem.  It does have a processor per
>>disk. (this is all SATA, if I didn't make that clear)
> Actually on SATA it's worse -- the disk just stops responding to everything
> and hangs.  If you don't detect this condition then you go into an
> infinite wait.
> In any case, yes the ATA RAID code could use a massive robustness pass. So
> could the core ATA code.  Patches accepted :)

Actually I'm in the process of rewriting the ATA RAID code, so things
are rolling, albeit slowly; time is a precious resource. I believe that
it can be made pretty robust, but the rest of the kernel still has
issues with disappearing devices etc. that are outside ATA's realm.

Anyhow, I can only test with the HW I have here in the lab, which by no
means covers all possible permutations, so testing etc. by the community
is very much needed here to get things sorted out...
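
For readers following the thread, the "infinite wait" Doug describes above
can be illustrated with a small user-space sketch. This is not code from the
ATA driver or from the rewrite; the names wait_ready, poll_status and
DeviceTimeout are hypothetical and only show the general idea of bounding a
wait on a device that may have stopped responding.

```python
import time


class DeviceTimeout(Exception):
    """Raised when the (simulated) device does not become ready in time."""


def wait_ready(poll_status, timeout_s, interval_s=0.01):
    """Poll poll_status() until it returns True or timeout_s elapses.

    poll_status is a hypothetical callable standing in for "read the
    device status"; a hung disk is modelled as one that never returns True.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if poll_status():
            return True
        if time.monotonic() >= deadline:
            # Without this check a dead disk would keep us looping forever.
            raise DeviceTimeout(f"no response within {timeout_s} s")
        time.sleep(interval_s)


if __name__ == "__main__":
    try:
        wait_ready(lambda: False, timeout_s=0.1)  # simulated hung disk
    except DeviceTimeout as exc:
        print("marking disk as failed:", exc)
```

With a deadline in place the caller gets an error it can act on (for example,
dropping the disk from the array) instead of blocking indefinitely.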



More information about the freebsd-stable mailing list