zraid2 loses a single disk and becomes difficult to recover

Pawel Jakub Dawidek pjd at FreeBSD.org
Wed Oct 14 06:21:16 UTC 2009


On Mon, Oct 12, 2009 at 08:49:37PM +0100, Alex Trull wrote:
> I managed to cleanly recover all critical data by cloning the most recent
> snapshots of all my filesystems (which worked even for those filesystems
> that had disappeared from 'zfs list') - and moving back to ufs2
> 
> The 'live' filesystems since the snapshots had pretty much gone corrupt.
> 
> Interesting note is that even if I promoted those clones - if the system
> was rebooted the contents of the snapshots became gobbledygooked (invalid
> byte sequence errors on numerous files).
> 
> As it stands I managed to recover 100% of the data, so I'm out the woods.

I'm glad to hear that.

> How does a dual-parity array lose its mind when only one disk is lost ?
> Might it have been related to the old TXGid I found on ad16 and ad17 ?

Yes, definitely. For some reason ZFS didn't update the txg on those two
disks, so at this point you were running without parity. The problem is
that ZFS didn't start a resilver automatically and also didn't report
this situation properly. I think I saw this in the past. Running
'zpool scrub' on this pool will trigger a resilver. There must be a bug.
I tried to reproduce it by modifying the code not to update the txg on
one of the components. There are three places where this can happen on
a system crash/power failure and I tried all of them - no luck, ZFS was
able to recover properly.

It would be a good idea to run 'zpool scrub' on a regular basis, even if
only to see whether it triggers a resilver (it can be stopped after a
few minutes with 'zpool scrub -s'). Of course it is advised to run a
full scrub from time to time.
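
A rough sketch of such a short check, in case it is useful. The function
names (scan_state, quick_scrub_check) and the 300 second default are
made up for illustration, and the script assumes that 'zpool status'
prints the phrase "resilver in progress" or "scrub in progress"; the
exact wording differs between ZFS versions, so adjust the patterns to
what your system prints.

```shell
#!/bin/sh
# Hypothetical helper: classify 'zpool status' text read from stdin.
# Assumption: the status text contains "resilver in progress" or
# "scrub in progress"; the wording may differ on your ZFS version.
scan_state() {
    status_text=$(cat)
    case "$status_text" in
        *"resilver in progress"*) echo resilver ;;
        *"scrub in progress"*)    echo scrub ;;
        *)                        echo idle ;;
    esac
}

# Hypothetical wrapper: start a scrub, wait a while, report what the
# pool is doing, and stop the scrub only if no resilver was triggered.
# $1 is the pool name, $2 the wait in seconds (default 300, arbitrary).
quick_scrub_check() {
    pool=$1
    wait_secs=${2:-300}
    zpool scrub "$pool" || return 1
    sleep "$wait_secs"
    state=$(zpool status "$pool" | scan_state)
    echo "$pool: $state"
    # A plain scrub can be stopped early; a resilver is left to finish.
    if [ "$state" = scrub ]; then
        zpool scrub -s "$pool"
    fi
}
```

It could be called as 'quick_scrub_check tank 300' (tank being a
placeholder pool name), for example from a periodic job. If it reports
"resilver", the pool was not fully in sync and is best left alone until
the resilver completes.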

Do you still have this pool around?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!