GEOM_MIRROR after crash not identical

Frank B. Scholl frank.b.scholl at web.de
Wed Jun 21 06:50:49 UTC 2006


hello list,

i just wanted to describe what happened to me last night.

i have an ultra 10 with a highpoint ide controller, with udma100 drives on 
each channel. the machine boots from compact flash on the onboard 
controller. the two udma100 drives form a mirror, which is encrypted with 
geli. after creating the device with gmirror and inserting the second disk, 
everything ran fine.

then i needed to power down the machine to add more drives. after the machine 
came back up, the mirror was degraded. every attempt to insert the second 
disk as a valid provider failed.

that wouldn't be so bad, i thought: just remount the degraded mirror read-only 
and back up the data, then see what can be done. well, after mounting i saw 
that a lot of data was missing, 180gb out of 300gb total to be exact. so i 
thought it might be a filesystem problem and tried to fsck it. that didn't 
work either: there were a lot of unreadable blocks on the device, and 
geom_geli through geom_mirror didn't stop flooding my logs. the underlying 
problem: dma timeouts and interrupt storms en masse, with a controller that 
had previously worked flawlessly for _weeks_ without a reboot. so i forced 
hw.ata.ata_dma and hw.ata.atapi_dma to zero, which got around the timeout 
problem. changing cables, which seems to be common practice in such cases, 
didn't help either, btw.

up to that point i had only worked with the first disk, and i decided to go 
to sleep after having lost 180gb of data on a mirror device. the next morning 
i woke up with quite a good idea: let's try the same thing again, mounting 
the degraded mirror, but with the second provider only. so i unplugged the 
first disk, booted, and behold, it worked.

so now.. how can it be that after a single reboot the two providers are not 
exactly the same? the thing is.. the last data i have on the first provider 
is from 14 june, while on the second provider i have everything up to 20 
june. the machine was powered down yesterday and had been running at least 
since the start of this month, if not longer. i went through the logfiles and 
there was not a single hint of geom_mirror complaining about an inconsistency.
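
for anyone who wants to check how far two mirror halves have drifted apart, 
here is a minimal python sketch of a block-by-block comparison. it is only an 
illustration under assumptions: the helper name differing_blocks and the 
block size are made up, and the in-memory streams in the demo stand in for 
the two providers opened read-only in binary mode. since geli sits on top of 
the mirror here, the raw providers should hold the same ciphertext and could 
be compared without the key. as far as i know, gmirror keeps its own metadata 
in the last sector of each provider, so a difference at the very end would be 
expected.

```python
import io


def differing_blocks(a, b, block_size=64 * 1024):
    """Yield the byte offset of every block that differs between two
    binary streams. A stream that ends early counts as different from
    that point on."""
    offset = 0
    while True:
        block_a = a.read(block_size)
        block_b = b.read(block_size)
        if not block_a and not block_b:
            break  # both streams are exhausted
        if block_a != block_b:
            yield offset
        offset += block_size


if __name__ == "__main__":
    # Hypothetical stand-ins for the two mirror halves; on a real system
    # these would be the two disk devices opened with open(path, "rb").
    first = io.BytesIO(b"A" * 8 + b"B" * 8 + b"C" * 8)
    second = io.BytesIO(b"A" * 8 + b"X" * 8 + b"C" * 8)
    print(list(differing_blocks(first, second, block_size=8)))
```

in the demo only the middle block differs, so the only offset reported is 8. 
on real disks a long run of differing offsets would show how much of the 
stale half never received the later writes.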

any ideas? my data is back, so i'm not crying anymore. is this a problem 
caused by my choice of platform, namely sparc64?

thanks for an answer, cheers,

frank scholl

