drive failure during rebuild causes page fault
Joe Rhett
jrhett at meer.net
Tue Dec 14 16:54:10 PST 2004
Soren, do you have any thoughts on what I could do to alleviate or better
debug this page fault? I've found three ways to cause it. In all cases,
"pull" means either a physical pull or "atacontrol detach <channel>":
1. Pull a drive and rebuild onto the hot spare. Pull the hot spare. *boom*
2. Pull a drive and rebuild onto the hot spare. Pull the good disk. *boom*
   (This should cause a filesystem failure, but not a page fault, when the
   array does not hold /.)
3. Pull a drive and then put it back. The system suddenly has a new array
   with just that drive in it. "atacontrol delete <new-array>" *boom*
In particular, what's the story with the new array appearing when you
insert a drive with array meta-data on it? That array appears to be
half-there (no devices, etc) which is probably what causes #2...
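For anyone who wants to try case 3 without pulling hardware, here is a rough
dry-run sketch. It only prints the commands unless RUN=1 is set. CHANNEL and
NEW_ARRAY are placeholders to fill in for your system, and the "attach" and
"status" subcommands are taken from my reading of atacontrol(8) rather than
from the steps above, so treat them as assumptions.

```shell
#!/bin/sh
# Dry-run sketch of case 3. CHANNEL and NEW_ARRAY are placeholders, not
# values from this report. Set RUN=1 to execute instead of just printing.
CHANNEL="${CHANNEL:-<channel>}"
NEW_ARRAY="${NEW_ARRAY:-<new-array>}"

# Print the command by default; execute it only when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

case3() {
    run atacontrol detach "$CHANNEL"    # software "pull" of the drive
    run atacontrol attach "$CHANNEL"    # put it back (assumed subcommand)
    run atacontrol status "$NEW_ARRAY"  # look at the array that appears (assumed)
    run atacontrol delete "$NEW_ARRAY"  # the step that page faults here
}

case3
```

Running it as-is is harmless; with RUN=1 on a test box, expect the last step
to panic the machine if the problem is present.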
On Tue, Dec 14, 2004 at 07:58:53AM +0100, Søren Schmidt wrote:
> Actually I'm in the process of rewriting the ATA RAID code, so things
> are rolling, albeit slowly; time is a precious resource. I believe that
> it can be made pretty robust, but the rest of the kernel still has
> issues with disappearing devices etc. that are out of ATA's realm.
>
> Anyhow, I can only test with the HW I have here in the lab, which by no
> means covers all possible permutations, so testing etc. by the community is
> very much needed here to get things sorted out...
--
Joe Rhett
Senior Geek
Meer.net