raidz2 loses a single disk and becomes difficult to recover
Alex Trull
alextzfs at googlemail.com
Mon Oct 12 19:49:40 UTC 2009
I managed to cleanly recover all critical data by cloning the most recent
snapshots of all my filesystems (which worked even for those filesystems
that had disappeared from 'zfs list') and moving back to UFS2.
The 'live' filesystems had become pretty much corrupt since those snapshots
were taken.
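The clone-based rescue can be sketched as a dry run. Everything here is illustrative: the `plan_rescue` helper, the `@latest` snapshot placeholder, and the /rescue UFS2 target are assumptions, not commands taken from this thread.

```shell
#!/bin/sh
# Dry-run sketch of the rescue: for each dataset, clone its most recent
# snapshot, then copy the clone's contents off to stable storage.
plan_rescue() {
    for fs in "$@"; do
        # In practice the newest snapshot would come from something like:
        #   zfs list -H -t snapshot -o name -S creation -d 1 "$fs" | head -n 1
        snap="${fs}@latest"                     # placeholder snapshot name
        echo "zfs clone ${snap} ${fs}-rescue"   # clones are cheap and browsable
        echo "rsync -a /${fs}-rescue/ /rescue/${fs}/"
    done
}

# print the commands that would be run for two of the datasets involved
plan_rescue fatman/backup fatman/jail/mail
```

Dropping the `echo`s would turn the plan into the real thing; copying the data off immediately matters here, since (as noted just below) the clones did not survive reboots intact.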
An interesting note: even when I promoted those clones, the contents of the
snapshots became gobbledygook after a reboot (invalid byte sequence errors
on numerous files).
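Given that the clones turned to mush after a reboot, an alternative worth noting is serializing the snapshots to files on stable storage straight away, so nothing depends on the pool surviving another import. A sketch only; the dataset and snapshot names are placeholders, not taken from this thread:

```sh
# stream each snapshot to a file on the UFS2 rescue disk; the stream can be
# restored later with `zfs receive`, even into a brand-new pool
zfs send fatman/backup@latest > /rescue/fatman-backup.zfs
zfs send fatman/jail/mail@latest > /rescue/fatman-jail-mail.zfs

# later, on a healthy pool:
# zfs receive newpool/backup < /rescue/fatman-backup.zfs
```

One caveat: `zfs send` reads the snapshot through the same pool metadata, so if the pool's state is rotting, sooner is better.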
As it stands I managed to recover 100% of the data, so I'm out of the woods.
How does a dual-parity array lose its mind when only one disk is lost?
Might it have been related to the old TXGid I found on ad16 and ad17?
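On that ad16/ad17 question: the zdb survey quoted below shows both drives carrying labels at txg 46408223 while the rest of the pool is at 46488654. A tiny hypothetical helper (the function name and "device txg" input format are made up here, mimicking pairs scraped from `zdb -l`) can flag that kind of lag mechanically:

```shell
#!/bin/sh
# find_stale_labels: read "device txg" pairs on stdin and report devices
# whose label txg lags the newest txg seen across the pool.
find_stale_labels() {
    awk '{ txg[$1] = $2 + 0; if ($2 + 0 > max) max = $2 + 0 }
         END { for (d in txg)
                   if (txg[d] < max)
                       print d " has stale label txg " txg[d] " (latest is " max ")" }'
}

# example, using the txgs from the zdb survey quoted below
printf '%s\n' 'ad10 46488654' 'ad16 46408223' 'ad17 46408223' | find_stale_labels
```

One possible (hedged) reading: vdevs whose labels quietly lag the pool by some 80,000 txgs without ever logging read/write/checksum errors might mean writes were not reaching those disks, which could help explain how a raidz2 ran out of replicas after a single failure.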
--
Alex
2009/10/11 Alex Trull <alextzfs at googlemail.com>
> Well, after trying a lot of things (zpool import with and without the cache
> file in place, etc.), it eventually managed to mount the pool, at least
> partially, with errors:
>
> zfs list output:
> cannot iterate filesystems: I/O error
> NAME                     USED  AVAIL  REFER  MOUNTPOINT
> fatman                  1.40T  1.70T  51.2K  /fatman
> fatman/backup            100G  99.5G  95.5G  /fatman/backup
> fatman/jail              422G  1.70T  60.5K  /fatman/jail
> fatman/jail/havnor       198G  51.7G   112G  /fatman/jail/havnor
> fatman/jail/mail        19.4G  30.6G  13.0G  /fatman/jail/mail
> fatman/jail/syndicate   16.6G   103G  10.5G  /fatman/jail/syndicate
> fatman/jail/thirdforces  159G  41.4G  78.1G  /fatman/jail/thirdforces
> fatman/jail/web         24.8G  25.2G  22.3G  /fatman/jail/web
> fatman/stash             913G  1.70T   913G  /fatman/stash
>
> (end of the dmesg)
> JMR: vdev_uberblock_load_done ubbest ub_txg=46475461 ub_timestamp=1255231841
> JMR: vdev_uberblock_load_done ub_txg=46481476 ub_timestamp=1255234263
> JMR: vdev_uberblock_load_done ubbest ub_txg=46481476 ub_timestamp=1255234263
> JMR: vdev_uberblock_load_done ubbest ub_txg=46475459 ub_timestamp=1255231780
> JMR: vdev_uberblock_load_done ubbest ub_txg=46475458 ub_timestamp=1255231750
> JMR: vdev_uberblock_load_done ub_txg=46481473 ub_timestamp=1255234263
> JMR: vdev_uberblock_load_done ubbest ub_txg=46481473 ub_timestamp=1255234263
> JMR: vdev_uberblock_load_done ubbest ub_txg=46481472 ub_timestamp=1255234263
> Solaris: WARNING: can't open objset for fatman/jail/margaret
> Solaris: WARNING: can't open objset for fatman/jail/margaret
> Solaris: WARNING: ZFS replay transaction error 86, dataset fatman/jail/havnor, seq 0x25442, txtype 9
>
> Solaris: WARNING: ZFS replay transaction error 86, dataset fatman/jail/mail, seq 0x1e200, txtype 9
>
> Solaris: WARNING: ZFS replay transaction error 86, dataset fatman/jail/thirdforces, seq 0x732e3, txtype 9
>
> [root at potjie /fatman/jail/mail]# zpool status -v
>  pool: fatman
> state: DEGRADED
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: resilver in progress for 0h4m, 0.83% done, 8h21m to go
> config:
>
>         NAME             STATE     READ WRITE CKSUM
>         fatman           DEGRADED     0     0    34
>           raidz2         DEGRADED     0     0   384
>             replacing    DEGRADED     0     0     0
>               da2/old    REMOVED      0    24     0
>               da2        ONLINE       0     0     0  1.71G resilvered
>             ad4          ONLINE       0     0     0  21.3M resilvered
>             ad6          ONLINE       0     0     0  21.4M resilvered
>             ad20         ONLINE       0     0     0  21.3M resilvered
>             ad22         ONLINE       0     0     0  21.3M resilvered
>             ad17         ONLINE       0     0     0  21.3M resilvered
>             da3          ONLINE       0     0     0  21.3M resilvered
>             ad10         ONLINE       0     0     1  21.4M resilvered
>             ad16         ONLINE       0     0     0  21.2M resilvered
>         cache
>           ad18           ONLINE       0     0     0
>
> errors: Permanent errors have been detected in the following files:
>
> fatman/jail/margaret:<0x0>
> fatman/jail/syndicate:<0x0>
> fatman/jail/mail:<0x0>
> /fatman/jail/mail/tmp
> fatman/jail/havnor:<0x0>
> fatman/jail/thirdforces:<0x0>
> fatman/backup:<0x0>
>
> jail/margaret and backup aren't showing up in 'zfs list'.
> jail/syndicate is showing up but isn't viewable.
>
> It seems the latest content on the better-looking zfs filesystems is quite
> recent.
>
> Any thoughts about what is going on?
>
> I have plenty of snapshots on these zfs filesystems - any suggestions on
> trying to get them back?
>
> --
> Alex
>
> 2009/10/11 Alex Trull <alextzfs at googlemail.com>
>
> Hi All,
>>
>> My raidz2 pool broke this morning on RELENG_7 with ZFS v13.
>>
>> The system failed this morning and came back without the pool, having lost
>> a disk.
>>
>> This is how I found the system:
>>
>>  pool: fatman
>> state: FAULTED
>> status: One or more devices could not be used because the label is missing
>>         or invalid. There are insufficient replicas for the pool to continue
>>         functioning.
>> action: Destroy and re-create the pool from a backup source.
>>    see: http://www.sun.com/msg/ZFS-8000-5E
>>  scrub: none requested
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         fatman      FAULTED      0     0     1  corrupted data
>>           raidz2    DEGRADED     0     0     6
>>             da2     FAULTED      0     0     0  corrupted data
>>             ad4     ONLINE       0     0     0
>>             ad6     ONLINE       0     0     0
>>             ad20    ONLINE       0     0     0
>>             ad22    ONLINE       0     0     0
>>             ad17    ONLINE       0     0     0
>>             da2     ONLINE       0     0     0
>>             ad10    ONLINE       0     0     0
>>             ad16    ONLINE       0     0     0
>>
>> Initially, it complained that da3 had moved to da2 (the original da2 had
>> failed and was no longer seen).
>>
>> I replaced the original da2 and bumped what was originally da3 back up to
>> da3 using the controller's ordering.
>>
>> [root at potjie /dev]# zpool status
>>  pool: fatman
>> state: FAULTED
>> status: One or more devices could not be used because the label is missing
>>         or invalid. There are insufficient replicas for the pool to continue
>>         functioning.
>> action: Destroy and re-create the pool from a backup source.
>>    see: http://www.sun.com/msg/ZFS-8000-5E
>>  scrub: none requested
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         fatman      FAULTED      0     0     1  corrupted data
>>           raidz2    ONLINE       0     0     6
>>             da2     UNAVAIL      0     0     0  corrupted data
>>             ad4     ONLINE       0     0     0
>>             ad6     ONLINE       0     0     0
>>             ad20    ONLINE       0     0     0
>>             ad22    ONLINE       0     0     0
>>             ad17    ONLINE       0     0     0
>>             da3     ONLINE       0     0     0
>>             ad10    ONLINE       0     0     0
>>             ad16    ONLINE       0     0     0
>>
>> The issue looks very similar to this one (JMR's issue):
>> http://freebsd.monkey.org/freebsd-fs/200902/msg00017.html
>>
>> I've tried the methods there without much result.
>>
>> Using JMR's patches/debugs to see what is going on, this is what I got:
>>
>> JMR: vdev_uberblock_load_done ubbest ub_txg=46488653 ub_timestamp=1255246834
>> JMR: vdev_uberblock_load_done ub_txg=46475459 ub_timestamp=1255231780
>> JMR: vdev_uberblock_load_done ubbest ub_txg=46488653 ub_timestamp=1255246834
>> JMR: vdev_uberblock_load_done ub_txg=46475458 ub_timestamp=1255231750
>> JMR: vdev_uberblock_load_done ubbest ub_txg=46488653 ub_timestamp=1255246834
>> JMR: vdev_uberblock_load_done ub_txg=46481473 ub_timestamp=1255234263
>> JMR: vdev_uberblock_load_done ubbest ub_txg=46488653 ub_timestamp=1255246834
>> JMR: vdev_uberblock_load_done ub_txg=46481472 ub_timestamp=1255234263
>> JMR: vdev_uberblock_load_done ubbest ub_txg=46488653 ub_timestamp=1255246834
>>
>> But JMR's patch still doesn't let me import, even with a decremented txg.
>>
>> I then had a look around the drives using zdb and a dirty script:
>>
>> [root at potjie /dev]# ls /dev/ad* /dev/da2 /dev/da3 | awk '{print "echo "$1";zdb -l "$1" |grep txg"}' | sh
>> /dev/ad10
>> txg=46488654
>> txg=46488654
>> txg=46488654
>> txg=46488654
>> /dev/ad16
>> txg=46408223 <- old TXGid ?
>> txg=46408223
>> txg=46408223
>> txg=46408223
>> /dev/ad17
>> txg=46408223 <- old TXGid ?
>> txg=46408223
>> txg=46408223
>> txg=46408223
>> /dev/ad18 (ssd)
>> /dev/ad19 (spare drive, removed from pool some time ago)
>> txg=0
>> create_txg=0
>> txg=0
>> create_txg=0
>> txg=0
>> create_txg=0
>> txg=0
>> create_txg=0
>> /dev/ad20
>> txg=46488654
>> txg=46488654
>> txg=46488654
>> txg=46488654
>> /dev/ad22
>> txg=46488654
>> txg=46488654
>> txg=46488654
>> txg=46488654
>> /dev/ad4
>> txg=46488654
>> txg=46488654
>> txg=46488654
>> txg=46488654
>> /dev/ad6
>> txg=46488654
>> txg=46488654
>> txg=46488654
>> txg=46488654
>> /dev/da2 <- new drive that replaced the broken da2
>> /dev/da3
>> txg=46488654
>> txg=46488654
>> txg=46488654
>> txg=46488654
>>
>> I had not previously seen any checksum or other errors on ad16 and ad17,
>> and I do check regularly.
>>
>> Any thoughts on what to try next?
>>
>> Regards,
>>
>> Alex
>>
>>
>
More information about the freebsd-fs mailing list