ZFS pool faulted (corrupt metadata) but the disk data appears ok...

Michelle Sullivan michelle at sorbs.net
Mon Feb 9 13:19:29 UTC 2015


Stefan Esser wrote:
>
> The point where zdb seg faults hints at the data structure that is
> corrupt. You may get some output before the seg fault, if you add
> a number of -v options (they add up to higher verbosity).
>
> Else, you may be able to look at the core and identify the function
> that fails. You'll most probably need zdb and libzfs compiled with
> "-g" to get any useful information from the core, though.
>
> For my failed pool, I noticed that internal assumptions were
> violated, due to some free space occurring in more than one entry.
> I had to special case the test in some function to ignore this
> situation (I knew that I'd only ever wanted to mount that pool
> R/O to rescue my data). But skipping the test did not suffice,
> since another assert triggered (after skipping the NULL dereference,
> the calculated sum of free space did not match the recorded sum, I
> had to disable that assert, too). With these two patches I was able
> to recover the pool starting at a TXG less than 100 transactions back,
> which was sufficient for my purpose ...
>   

The question is whether zdb will 'fix' things, or whether it is just a
debug utility (for displaying)?

If it is just a debug tool and won't fix anything, I'm quite happy to
roll back transactions; the question is how (presumably after one finds
the corrupt point). I'm quite happy to just do it by hand until I get
success, as it will save 2+ months of work. I did get an output with a
date/time that indicates where the rollback would go to...
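
A minimal sketch of how a by-hand rollback attempt could be assembled, assuming the rewind is done through `zpool import` rather than zdb. The helper name `rewind_import_cmd` is hypothetical, and the sketch only builds the command line without running it. It assumes the `-F` (recovery), `-n` (dry run, used with `-F`), `-T txg` and `-o readonly=on` options of `zpool import`; `-T` is not documented in every release, so check zpool(8) on the system in question before relying on it.

```python
import shlex


def rewind_import_cmd(pool, txg=None, dry_run=True):
    """Build (but do not execute) a read-only rewind import command.

    pool    -- pool name, e.g. "storage"
    txg     -- optional transaction group to rewind to (assumed -T option)
    dry_run -- add -n so that nothing is actually rewound
    """
    # Read-only import so that nothing is written to the damaged pool.
    cmd = ["zpool", "import", "-o", "readonly=on", "-f", "-F"]
    if dry_run:
        cmd.append("-n")
    if txg is not None:
        cmd += ["-T", str(txg)]
    cmd.append(pool)
    return cmd


if __name__ == "__main__":
    # Print the command for review instead of running it.
    print(shlex.join(rewind_import_cmd("storage")))
```

Printing the command instead of executing it is deliberate: with a pool in this state, each attempt is probably best reviewed before it is run, starting with the dry run and only then trying progressively older TXG values.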

In the meantime this appears to be working without crashing - it's been
running for days now...

  PID USERNAME       THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 4332 root           209  22    0 23770M 23277M uwait   1 549:07 11.04% zdb -AAA -L -uhdi -FX -e storage

Michelle

-- 
Michelle Sullivan
http://www.mhix.org/


