ZFS recovery after power failure

Wed Dec 22 16:31:21 UTC 2010

Hi Sergey,

I am curious about the details of how you did this, in case I ever need to do it in the future.

I presume what you did was...

A) grab the uberblocks off of each disk to text files using dd or something?

B) find the most recent uberblocks (they should be written round-robin, with incrementing transaction group numbers, so fairly easy to analyze)?

C) clear out the two most recent uberblocks on each disk by setting their transaction group (uint64_t ub_txg) to zero or something, using dd again?

D) Did you also have to recalculate the checksum (uint64_t ub_guid_sum), or were you able to leave it as-is and ZFS was okay with that? And did ZFS then simply write a new, valid uberblock to that position in the array as new transactions began, once you were able to get the pool remounted?
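
For steps A and B, something like the following might do the analysis. This is only a rough sketch under assumptions of mine, not what Sergey actually ran: I am assuming each vdev label is 256 KiB, that the uberblock ring occupies the second 128 KiB of the label, that each slot is 1 KiB (which I believe holds for 512-byte-sector vdevs), and that a slot starts with ub_magic (0x00bab10c), ub_version, ub_txg, ub_guid_sum and ub_timestamp as consecutive uint64_t fields. The names parse_uberblocks and newest are made up for illustration. As far as I understand, ub_guid_sum is the sum of the vdev GUIDs rather than a checksum, so the sketch just reads past it.

```python
import struct

# Assumed on-disk layout (see the caveats above).
LABEL_SIZE = 256 * 1024       # size of one vdev label
UB_RING_OFFSET = 128 * 1024   # uberblock ring starts halfway into the label
UB_MAGIC = 0x00BAB10C         # uberblock magic number
UB_HEAD = 40                  # five uint64_t fields at the start of a slot


def parse_uberblocks(label, slot_size=1024):
    """Return (slot, txg, timestamp) for every slot with a valid magic."""
    found = []
    ring = label[UB_RING_OFFSET:LABEL_SIZE]
    for slot in range(len(ring) // slot_size):
        head = ring[slot * slot_size:slot * slot_size + UB_HEAD]
        # The pool may have been written on either endianness, so try both.
        for endian in ("<", ">"):
            magic, _version, txg, _guid_sum, timestamp = struct.unpack(
                endian + "5Q", head)
            if magic == UB_MAGIC:
                found.append((slot, txg, timestamp))
                break
    return found


def newest(uberblocks, count=2):
    """Return the `count` uberblocks with the highest txg, newest first."""
    return sorted(uberblocks, key=lambda u: u[1], reverse=True)[:count]
```

To try it against a real disk, one could first copy label 0 to a file, for example with dd if=/dev/da9 of=da9.label0 bs=256k count=1, and pass that file's bytes to parse_uberblocks. That only reads from the device and never writes to it.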

Thanks,

- Mike

On Dec 19, 2010, at 7:13 AM, Sergey Gavrilov wrote:

> I've destroyed the 2 latest uberblocks and imported the pool. It's ok now.
> Your command doesn't work for me as is, but the txg numbers of all labels are
> the same and equal to 666999 now.
> I think that is already useless information.
> So I saved those uberblocks. I can provide them if you need.
> 
> 2010/12/19 Pawel Jakub Dawidek <pjd at freebsd.org>
> 
>> On Sat, Dec 18, 2010 at 11:21:52AM +0300, Sergey Gavrilov wrote:
>>> zpool import -F pool2 ok, but
>>> zpool status -xv
>>>  pool: pool2
>>> state: FAULTED
>>> status: The pool metadata is corrupted and the pool cannot be opened.
>>> action: Destroy and re-create the pool from a backup source.
>>>   see: http://www.sun.com/msg/ZFS-8000-72
>>> scrub: none requested
>>> config:
>>> 
>>>    NAME        STATE     READ WRITE CKSUM
>>>    pool2       FAULTED      0     0     1  corrupted data
>>>      raidz2    ONLINE       0     0     6
>>>        da9     ONLINE       0     0     0
>>>        da10    ONLINE       0     0     0
>>>        da11    ONLINE       0     0     0
>>>        da12    ONLINE       0     0     0
>>>        da13    ONLINE       0     0     0
>>>        da14    ONLINE       0     0     0
>>>        da15    ONLINE       0     0     0
>>>        da16    ONLINE       0     0     0
>>> 
>>> zpool clear pool2
>>> cannot clear errors for pool2: I/O error
>>> 
>>> Is there any way to recover the data, or at least a portion of it?
>> 
>> Could you provide output of:
>> 
>>       # apply "zdb -l /dev/da%1 | egrep '(^LABEL|txg=|)'" `jot 8 9`
>> 
>> --
>> Pawel Jakub Dawidek                       http://www.wheelsystems.com
>> pjd at FreeBSD.org                           http://www.FreeBSD.org
>> FreeBSD committer                         Am I Evil? Yes, I Am!
>> 
> 
> 
> 
> -- 
> Best regards,
> Sergey Gavrilov
> _______________________________________________
> freebsd-fs at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"