drive failure during rebuild causes page fault
sos at DeepCore.dk
Mon Dec 13 22:59:26 PST 2004
Doug White wrote:
> On Mon, 13 Dec 2004, Joe Rhett wrote:
>>>This is why I don't trust ATA RAID for fault tolerance -- it'll save your
>>>data, but the system will tank. Since the disk state is maintained by
>>>the OS and not abstracted by a separate processor, if a disk dies in a
>>>particularly bad way the system may not be able to cope.
>>Yes, but SATA isn't limited by this problem. It does have a processor per
>>disk. (this is all SATA, if I didn't make that clear)
> Actually on SATA it's worse -- the disk just stops responding to everything
> and hangs. If you don't detect this condition then you go into an
> infinite wait.
> In any case, yes the ATA RAID code could use a massive robustness pass. So
> could the core ATA code. Patches accepted :)
Actually I'm in the process of rewriting the ATA RAID code, so things
are rolling, albeit slowly; time is a precious resource. I believe that
it can be made pretty robust, but the rest of the kernel still has
issues with disappearing devices etc. that are out of ATA's realm.
Anyhow, I can only test with the HW I have here in the lab, which by no
means covers all possible permutations, so testing etc. by the community is
very much needed here to get things sorted out...
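
To illustrate the infinite-wait problem mentioned in the quoted text, here is
a minimal userland sketch of a bounded wait. It is not the ATA driver code:
struct fake_disk, fake_disk_ready() and wait_ready() are hypothetical names
made up for this example, and a real driver would use a time-based deadline
and interrupts rather than a simple poll counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a disk; not a real ATA driver structure. */
struct fake_disk {
    int ready_after;  /* failed polls before the device answers; negative = never */
    int polls;        /* number of polls performed so far */
};

/* Hypothetical status check: returns true once the device responds. */
bool fake_disk_ready(struct fake_disk *d)
{
    d->polls++;
    return d->ready_after >= 0 && d->polls > d->ready_after;
}

/*
 * Wait for the device, but give up after max_polls attempts instead of
 * spinning forever. Returns 0 if the device answered, -1 on timeout so
 * the caller can mark the disk as failed and carry on.
 */
int wait_ready(struct fake_disk *d, int max_polls)
{
    assert(max_polls > 0);
    for (int i = 0; i < max_polls; i++) {
        if (fake_disk_ready(d))
            return 0;
    }
    return -1;
}
```

With a disk that never answers (ready_after set to -1), wait_ready() returns
-1 once the poll budget is used up, where an unbounded loop would hang.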
More information about the freebsd-stable