raidz2 loses a single disk and becomes difficult to recover

Alex Trull alextzfs at googlemail.com
Sun Oct 11 16:27:22 UTC 2009


Well, after trying a lot of things (zpool import with and without the cache
file in place, etc.), it eventually managed to mount the pool, albeit with
errors.
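
For reference, the import attempts were roughly along these lines (a sketch
from memory; the exact flags varied between tries):

    # forced import by pool name
    zpool import -f fatman
    # the same again after moving the cache file out of the way
    mv /boot/zfs/zpool.cache /boot/zfs/zpool.cache.bak
    zpool import -f fatman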

zfs list output:
cannot iterate filesystems: I/O error
NAME                      USED  AVAIL  REFER  MOUNTPOINT
fatman                   1.40T  1.70T  51.2K  /fatman
fatman/backup             100G  99.5G  95.5G  /fatman/backup
fatman/jail               422G  1.70T  60.5K  /fatman/jail
fatman/jail/havnor        198G  51.7G   112G  /fatman/jail/havnor
fatman/jail/mail         19.4G  30.6G  13.0G  /fatman/jail/mail
fatman/jail/syndicate    16.6G   103G  10.5G  /fatman/jail/syndicate
fatman/jail/thirdforces   159G  41.4G  78.1G  /fatman/jail/thirdforces
fatman/jail/web          24.8G  25.2G  22.3G  /fatman/jail/web
fatman/stash              913G  1.70T   913G  /fatman/stash

(end of dmesg output)
JMR: vdev_uberblock_load_done ubbest ub_txg=46475461 ub_timestamp=1255231841
JMR: vdev_uberblock_load_done ub_txg=46481476 ub_timestamp=1255234263
JMR: vdev_uberblock_load_done ubbest ub_txg=46481476 ub_timestamp=1255234263
JMR: vdev_uberblock_load_done ubbest ub_txg=46475459 ub_timestamp=1255231780
JMR: vdev_uberblock_load_done ubbest ub_txg=46475458 ub_timestamp=1255231750
JMR: vdev_uberblock_load_done ub_txg=46481473 ub_timestamp=1255234263
JMR: vdev_uberblock_load_done ubbest ub_txg=46481473 ub_timestamp=1255234263
JMR: vdev_uberblock_load_done ubbest ub_txg=46481472 ub_timestamp=1255234263
Solaris: WARNING: can't open objset for fatman/jail/margaret
Solaris: WARNING: can't open objset for fatman/jail/margaret
Solaris: WARNING: ZFS replay transaction error 86, dataset fatman/jail/havnor, seq 0x25442, txtype 9
Solaris: WARNING: ZFS replay transaction error 86, dataset fatman/jail/mail, seq 0x1e200, txtype 9
Solaris: WARNING: ZFS replay transaction error 86, dataset fatman/jail/thirdforces, seq 0x732e3, txtype 9

[root at potjie /fatman/jail/mail]# zpool status -v
  pool: fatman
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
    entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver in progress for 0h4m, 0.83% done, 8h21m to go
config:

    NAME           STATE     READ WRITE CKSUM
    fatman         DEGRADED     0     0    34
      raidz2       DEGRADED     0     0   384
        replacing  DEGRADED     0     0     0
          da2/old  REMOVED      0    24     0
          da2      ONLINE       0     0     0  1.71G resilvered
        ad4        ONLINE       0     0     0  21.3M resilvered
        ad6        ONLINE       0     0     0  21.4M resilvered
        ad20       ONLINE       0     0     0  21.3M resilvered
        ad22       ONLINE       0     0     0  21.3M resilvered
        ad17       ONLINE       0     0     0  21.3M resilvered
        da3        ONLINE       0     0     0  21.3M resilvered
        ad10       ONLINE       0     0     1  21.4M resilvered
        ad16       ONLINE       0     0     0  21.2M resilvered
    cache
      ad18         ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        fatman/jail/margaret:<0x0>
        fatman/jail/syndicate:<0x0>
        fatman/jail/mail:<0x0>
        /fatman/jail/mail/tmp
        fatman/jail/havnor:<0x0>
        fatman/jail/thirdforces:<0x0>
        fatman/backup:<0x0>

jail/margaret and backup aren't showing up in zfs list.
jail/syndicate shows up, but its contents can't be browsed.
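
(To double-check which datasets are actually present and mounted, I've been
poking around with something like the following - a rough sketch:)

    zfs list -r fatman/jail
    zfs get mounted,mountpoint fatman/jail/syndicate
    ls /fatman/jail/syndicate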

The latest content on the healthier-looking ZFS filesystems seems to be quite
recent.

Any thoughts about what is going on?

I have plenty of snapshots on these ZFS filesystems - any suggestions on using
them to get the data back?
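
If the live datasets are damaged but their snapshots are intact, my plan is to
try pulling data out of a snapshot along these lines (a sketch - the snapshot
and destination names here are made up):

    # see which snapshots survived
    zfs list -t snapshot -r fatman
    # copy a known-good snapshot off to another pool
    zfs send fatman/jail/margaret@older-snap | zfs receive otherpool/margaret-recovered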

--
Alex

2009/10/11 Alex Trull <alextzfs at googlemail.com>

> Hi All,
>
> My raidz2 pool broke this morning on RELENG_7 with ZFS v13.
>
> The system failed this morning and came back up without the pool, having lost a disk.
>
> This is how I found the system:
>
>   pool: fatman
>  state: FAULTED
> status: One or more devices could not be used because the label is missing
>     or invalid.  There are insufficient replicas for the pool to continue
>     functioning.
> action: Destroy and re-create the pool from a backup source.
>    see: http://www.sun.com/msg/ZFS-8000-5E
>  scrub: none requested
> config:
>
>     NAME        STATE     READ WRITE CKSUM
>     fatman      FAULTED      0     0     1  corrupted data
>       raidz2    DEGRADED     0     0     6
>         da2     FAULTED      0     0     0  corrupted data
>         ad4     ONLINE       0     0     0
>         ad6     ONLINE       0     0     0
>         ad20    ONLINE       0     0     0
>         ad22    ONLINE       0     0     0
>         ad17    ONLINE       0     0     0
>         da2     ONLINE       0     0     0
>         ad10    ONLINE       0     0     0
>         ad16    ONLINE       0     0     0
>
> Initially it complained that da3 had moved to da2 (the original da2 had failed and
> was no longer seen).
>
> I replaced the original da2 and bumped what was originally da3 back up to
> da3 using the controller's ordering.
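>
> (To confirm which physical disk had ended up with which device name, I checked
> the controller's view of the devices, roughly like this - just the commands,
> from memory:)
>
>   camcontrol devlist   # CAM (da) devices
>   atacontrol list      # ATA (ad) channels and devices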
>
> [root at potjie /dev]# zpool status
>   pool: fatman
>  state: FAULTED
> status: One or more devices could not be used because the label is missing
>     or invalid.  There are insufficient replicas for the pool to continue
>     functioning.
> action: Destroy and re-create the pool from a backup source.
>    see: http://www.sun.com/msg/ZFS-8000-5E
>  scrub: none requested
> config:
>
>     NAME        STATE     READ WRITE CKSUM
>     fatman      FAULTED      0     0     1  corrupted data
>       raidz2    ONLINE       0     0     6
>         da2     UNAVAIL      0     0     0  corrupted data
>         ad4     ONLINE       0     0     0
>         ad6     ONLINE       0     0     0
>         ad20    ONLINE       0     0     0
>         ad22    ONLINE       0     0     0
>         ad17    ONLINE       0     0     0
>         da3     ONLINE       0     0     0
>         ad10    ONLINE       0     0     0
>         ad16    ONLINE       0     0     0
>
> Issue looks very similar to this (JMR's issue) :
> http://freebsd.monkey.org/freebsd-fs/200902/msg00017.html
>
> I've tried the methods there without much result.
>
> Using JMR's patches/debugs to see what is going on, this is what I got:
>
> JMR: vdev_uberblock_load_done ubbest ub_txg=46488653 ub_timestamp=1255246834
> JMR: vdev_uberblock_load_done ub_txg=46475459 ub_timestamp=1255231780
> JMR: vdev_uberblock_load_done ubbest ub_txg=46488653 ub_timestamp=1255246834
> JMR: vdev_uberblock_load_done ub_txg=46475458 ub_timestamp=1255231750
> JMR: vdev_uberblock_load_done ubbest ub_txg=46488653 ub_timestamp=1255246834
> JMR: vdev_uberblock_load_done ub_txg=46481473 ub_timestamp=1255234263
> JMR: vdev_uberblock_load_done ubbest ub_txg=46488653 ub_timestamp=1255246834
> JMR: vdev_uberblock_load_done ub_txg=46481472 ub_timestamp=1255234263
> JMR: vdev_uberblock_load_done ubbest ub_txg=46488653 ub_timestamp=1255246834
>
> But JMR's patch still doesn't let me import, even with a decremented txg.
>
> I then had a look around the drives using zdb and a quick-and-dirty script:
>
> [root at potjie /dev]# ls /dev/ad* /dev/da2 /dev/da3 | awk '{print "echo "$1";zdb -l "$1" |grep txg"}' | sh
> /dev/ad10
>     txg=46488654
>     txg=46488654
>     txg=46488654
>     txg=46488654
> /dev/ad16
>     txg=46408223 <- old TXGid ?
>     txg=46408223
>     txg=46408223
>     txg=46408223
> /dev/ad17
>     txg=46408223 <- old TXGid ?
>     txg=46408223
>     txg=46408223
>     txg=46408223
> /dev/ad18 (ssd)
> /dev/ad19 (spare drive, removed from pool some time ago)
>     txg=0
>     create_txg=0
>     txg=0
>     create_txg=0
>     txg=0
>     create_txg=0
>     txg=0
>     create_txg=0
> /dev/ad20
>     txg=46488654
>     txg=46488654
>     txg=46488654
>     txg=46488654
> /dev/ad22
>     txg=46488654
>     txg=46488654
>     txg=46488654
>     txg=46488654
> /dev/ad4
>     txg=46488654
>     txg=46488654
>     txg=46488654
>     txg=46488654
> /dev/ad6
>     txg=46488654
>     txg=46488654
>     txg=46488654
>     txg=46488654
> /dev/da2 <- new drive that replaced the broken da2
> /dev/da3
>     txg=46488654
>     txg=46488654
>     txg=46488654
>     txg=46488654
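>
> (The same check written as a slightly cleaner loop, for reference:)
>
>   for d in /dev/ad* /dev/da2 /dev/da3; do echo $d; zdb -l $d | grep txg; done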
>
> I did not see any checksum errors or other issues on ad16 and ad17 previously,
> and I do check regularly.
>
> Any thoughts on what to try next?
>
> Regards,
>
> Alex
>
>

