graid3 lost disk - array still fails

Tommi Lätti sty at blosphere.net
Mon Jan 2 06:36:51 PST 2006


A few hours ago, my customer's graid3 array crashed after losing one 
hard drive, and it is unable to recover. The data is easily replaceable, 
so I'm not losing sleep over that, but I'd really like to hear some 
ideas about what happened, if possible.

Since this was a 'do-it-cheaply' setup, we got three 160 GB Seagates, 
all old PATA drives, and put them in as primary master, secondary master 
and slave. Not the best possible combination, I know, but it worked.

The secondary master died a bit earlier and the array started 
rebuilding; then somebody rebooted the machine while it was 
rebuilding ad3...

ad2: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=286404016
GEOM_RAID3: Request failed. ad2[READ(offset=146638856192, length=8192)]
GEOM_RAID3: Request failed. raid3/gr0[READ(offset=293277712384, length=16384)]
GEOM_RAID3: Device gr0: provider ad2 disconnected.
GEOM_RAID3: Device gr0: provider raid3/gr0 destroyed.
GEOM_RAID3: Device gr0: rebuilding provider ad3 stopped.
GEOM_RAID3: Synchronization request failed (error=6). ad3[WRITE(offset=973602816, length=65536)]
GEOM_RAID3: Device gr0: provider ad3 disconnected.
GEOM_RAID3: Device gr0 destroyed.
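
As a sanity check, the numbers in the log above are consistent with each 
other. The sketch below only redoes the arithmetic. It assumes 512-byte 
sectors and that a three-disk graid3 array keeps data on two components 
and parity on the third. The variable names are made up for illustration 
and are not anything graid3 exposes.

```python
# Values copied from the kernel log above.
failed_lba = 286404016            # LBA reported by the ad2 READ_DMA failure
component_offset = 146638856192   # offset of the failed READ on ad2
component_length = 8192           # length of the failed READ on ad2
array_offset = 293277712384       # offset of the failed READ on raid3/gr0
array_length = 16384              # length of the failed READ on raid3/gr0
sync_offset = 973602816           # offset of the failed rebuild WRITE on ad3

# Assumptions, not taken from the log.
SECTOR_SIZE = 512                 # bytes per sector (assumed)
DATA_COMPONENTS = 2               # 3 disks minus 1 parity disk (assumed layout)

# The failing LBA and the failing component offset name the same spot on ad2.
lba_matches = failed_lba * SECTOR_SIZE == component_offset

# Each array request is split evenly over the data components.
offset_matches = component_offset * DATA_COMPONENTS == array_offset
length_matches = component_length * DATA_COMPONENTS == array_length

# How far into ad3 the failed rebuild write was, in MiB.
sync_mib = sync_offset / 2**20

print("LBA matches component offset:", lba_matches)
print("array offset is 2x component offset:", offset_matches)
print("array length is 2x component length:", length_matches)
print("failed rebuild write at MiB:", sync_mib)
```

All three checks hold, so the unreadable sector on ad2 is the direct 
cause of the failed array read. The failed rebuild write sits about 
928.5 MiB into ad3, which suggests, if synchronization runs from the 
start of the disk, that the rebuild was still near the beginning when 
ad2 dropped out.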

So now that ad2 has been removed, graid3 still reports ad3 as broken 
(GEOM_RAID3: Component ad3 (device gr0) broken, skipping.) and then 
proceeds to remove the array, since that is already the second missing 
disk and there are not enough disks left...

Now the question is: is there any way I could lie to graid3 and tell it 
that ad3 is okay?

I'm pretty sure there were no writes to the array around the time ad2 
crashed, so maybe some data would still be recoverable?

-- 
br,
Tommi


More information about the freebsd-stable mailing list