ZFS weird issue...
Michelle Sullivan
michelle at sorbs.net
Sun Dec 7 10:32:46 UTC 2014
Will Andrews wrote:
> On Fri, Dec 5, 2014 at 6:40 PM, Michelle Sullivan <michelle at sorbs.net> wrote:
>
>> Days later the new drive to replace the dead one arrived and was
>> inserted. The system refused to re-add it as there was data in the
>> cache, so I rebooted and cleared the cache (as per many web FAQs) and
>> reconfigured it to match the others. Can't do a 'zpool replace mfid8'
>> because that's already in the pool... (it was mfid9); can't use mfid15
>> because zpool reports it's not part of the config; can't use the unique
>> ID it received (can't find vdev)... HELP!! :)
>>
> [...]
>
>> root at colossus:~ # zpool status -v
>>
> [...]
>
>>   pool: sorbs
>>  state: DEGRADED
>> status: One or more devices could not be opened. Sufficient replicas
>>         exist for the pool to continue functioning in a degraded state.
>> action: Attach the missing device and online it using 'zpool online'.
>>    see: http://illumos.org/msg/ZFS-8000-2Q
>>   scan: scrub in progress since Fri Dec 5 17:11:29 2014
>>         2.51T scanned out of 29.9T at 89.4M/s, 89h7m to go
>>         0 repaired, 8.40% done
>> config:
>>
>>         NAME              STATE     READ WRITE CKSUM
>>         sorbs             DEGRADED     0     0     0
>>           raidz2-0        DEGRADED     0     0     0
>>             mfid0         ONLINE       0     0     0
>>             mfid1         ONLINE       0     0     0
>>             mfid2         ONLINE       0     0     0
>>             mfid3         ONLINE       0     0     0
>>             mfid4         ONLINE       0     0     0
>>             mfid5         ONLINE       0     0     0
>>             mfid6         ONLINE       0     0     0
>>             mfid7         ONLINE       0     0     0
>>             spare-8       DEGRADED     0     0     0
>>               1702922605  UNAVAIL      0     0     0  was /dev/mfid8
>>               mfid14      ONLINE       0     0     0
>>             mfid8         ONLINE       0     0     0
>>             mfid9         ONLINE       0     0     0
>>             mfid10        ONLINE       0     0     0
>>             mfid11        ONLINE       0     0     0
>>             mfid12        ONLINE       0     0     0
>>             mfid13        ONLINE       0     0     0
>>         spares
>>           933862663       INUSE     was /dev/mfid14
>>
>> errors: No known data errors
>> root at colossus:~ # uname -a
>> FreeBSD colossus.sorbs.net 9.2-RELEASE FreeBSD 9.2-RELEASE #0 r255898:
>> Thu Sep 26 22:50:31 UTC 2013
>> root at bake.isc.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
>>
> [...]
>
>> root at colossus:~ # ls -l /dev/mfi*
>> crw-r----- 1 root operator 0x22 Dec 5 17:18 /dev/mfi0
>> crw-r----- 1 root operator 0x68 Dec 5 17:18 /dev/mfid0
>> crw-r----- 1 root operator 0x69 Dec 5 17:18 /dev/mfid1
>> crw-r----- 1 root operator 0x78 Dec 5 17:18 /dev/mfid10
>> crw-r----- 1 root operator 0x79 Dec 5 17:18 /dev/mfid11
>> crw-r----- 1 root operator 0x7a Dec 5 17:18 /dev/mfid12
>> crw-r----- 1 root operator 0x82 Dec 5 17:18 /dev/mfid13
>> crw-r----- 1 root operator 0x83 Dec 5 17:18 /dev/mfid14
>> crw-r----- 1 root operator 0x84 Dec 5 17:18 /dev/mfid15
>> crw-r----- 1 root operator 0x6a Dec 5 17:18 /dev/mfid2
>> crw-r----- 1 root operator 0x6b Dec 5 17:18 /dev/mfid3
>> crw-r----- 1 root operator 0x6c Dec 5 17:18 /dev/mfid4
>> crw-r----- 1 root operator 0x6d Dec 5 17:18 /dev/mfid5
>> crw-r----- 1 root operator 0x6e Dec 5 17:18 /dev/mfid6
>> crw-r----- 1 root operator 0x75 Dec 5 17:18 /dev/mfid7
>> crw-r----- 1 root operator 0x76 Dec 5 17:18 /dev/mfid8
>> crw-r----- 1 root operator 0x77 Dec 5 17:18 /dev/mfid9
>> root at colossus:~ #
>>
>
> Hi,
>
> From the above it appears your replacement drive's current name is
> mfid15, and the spare is now mfid14.
>
No, I think LD8 was re-created but nothing was re-numbered... the
following seems to confirm that (if I'm reading it right).
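One quick check I can think of for which device is the blank disk -- a
sketch only, the path is my assumption:

  # a fresh replacement disk should carry no valid ZFS label
  zdb -l /dev/mfid15

If mfid15 really is the new drive, zdb should fail to find/unpack any of
its four labels.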
> What commands did you run that failed? Can you provide a copy of the
> first label from 'zdb -l /dev/mfid0'?
>
root at colossus:~ # zdb -l /dev/mfid0
--------------------------------------------
LABEL 0
--------------------------------------------
    version: 5000
    name: 'sorbs'
    state: 0
    txg: 979499
    pool_guid: 1038563320
    hostid: 339509314
    hostname: 'colossus.sorbs.net'
    top_guid: 386636424
    guid: 2060345993
    vdev_children: 1
    vdev_tree:
        type: 'raidz'
        id: 0
        guid: 386636424
        nparity: 2
        metaslab_array: 33
        metaslab_shift: 38
        ashift: 9
        asize: 45000449064960
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 2060345993
            path: '/dev/mfid0'
            phys_path: '/dev/mfid0'
            whole_disk: 1
            DTL: 154
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 61296476
            path: '/dev/mfid1'
            phys_path: '/dev/mfid1'
            whole_disk: 1
            DTL: 153
            create_txg: 4
        children[2]:
            type: 'disk'
            id: 2
            guid: 1565205219
            path: '/dev/mfid2'
            phys_path: '/dev/mfid2'
            whole_disk: 1
            DTL: 152
            create_txg: 4
        children[3]:
            type: 'disk'
            id: 3
            guid: 1876923630
            path: '/dev/mfid3'
            phys_path: '/dev/mfid3'
            whole_disk: 1
            DTL: 151
            create_txg: 4
        children[4]:
            type: 'disk'
            id: 4
            guid: 1068158627
            path: '/dev/mfid4'
            phys_path: '/dev/mfid4'
            whole_disk: 1
            DTL: 150
            create_txg: 4
        children[5]:
            type: 'disk'
            id: 5
            guid: 1726238716
            path: '/dev/mfid5'
            phys_path: '/dev/mfid5'
            whole_disk: 1
            DTL: 149
            create_txg: 4
        children[6]:
            type: 'disk'
            id: 6
            guid: 390028842
            path: '/dev/mfid6'
            phys_path: '/dev/mfid6'
            whole_disk: 1
            DTL: 148
            create_txg: 4
        children[7]:
            type: 'disk'
            id: 7
            guid: 1094656850
            path: '/dev/mfid7'
            phys_path: '/dev/mfid7'
            whole_disk: 1
            DTL: 147
            create_txg: 4
        children[8]:
            type: 'spare'
            id: 8
            guid: 1773868765
            whole_disk: 0
            create_txg: 4
            children[0]:
                type: 'disk'
                id: 0
                guid: 1702922605
                path: '/dev/mfid8'
                phys_path: '/dev/mfid8'
                whole_disk: 1
                DTL: 166
                create_txg: 4
            children[1]:
                type: 'disk'
                id: 1
                guid: 933862663
                path: '/dev/mfid14'
                phys_path: '/dev/mfid14'
                whole_disk: 1
                is_spare: 1
                DTL: 146
                create_txg: 4
                resilvering: 1
        children[9]:
            type: 'disk'
            id: 9
            guid: 1771170870
            path: '/dev/mfid8'
            phys_path: '/dev/mfid8'
            whole_disk: 1
            DTL: 145
            create_txg: 4
        children[10]:
            type: 'disk'
            id: 10
            guid: 1797981023
            path: '/dev/mfid9'
            phys_path: '/dev/mfid9'
            whole_disk: 1
            DTL: 144
            create_txg: 4
        children[11]:
            type: 'disk'
            id: 11
            guid: 1424656624
            path: '/dev/mfid10'
            phys_path: '/dev/mfid10'
            whole_disk: 1
            DTL: 143
            create_txg: 4
        children[12]:
            type: 'disk'
            id: 12
            guid: 1908699165
            path: '/dev/mfid11'
            phys_path: '/dev/mfid11'
            whole_disk: 1
            DTL: 142
            create_txg: 4
        children[13]:
            type: 'disk'
            id: 13
            guid: 396147269
            path: '/dev/mfid12'
            phys_path: '/dev/mfid12'
            whole_disk: 1
            DTL: 141
            create_txg: 4
        children[14]:
            type: 'disk'
            id: 14
            guid: 847844383
            path: '/dev/mfid13'
            phys_path: '/dev/mfid13'
            whole_disk: 1
            DTL: 140
            create_txg: 4
    features_for_read:
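For reference: the UNAVAIL disk is children[8]/children[0] above, guid
1702922605, and children[9] (guid 1771170870) claims the same path,
/dev/mfid8. A quick way to pull just the guids and paths out of the
label -- a sketch using standard tools:

  # list every guid and device path in label 0
  zdb -l /dev/mfid0 | grep -E 'guid|path'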
> The label will provide you with the full vdev guid that you need to
> replace the original drive with a new one.
>
> Another thing you could do is wait for the spare to finish
> resilvering, then promote it to replace the original drive, and make
> your new one a spare. Considering the time required to resilver this
> pool configuration, that may be preferable for you.
>
> --Will.
>
Two vdevs claiming the physical path /dev/mfid8... that can't be good...
and I can't seem to use the GUIDs either.
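For the record, this is what I understand the two routes would look
like -- a sketch only, using the GUIDs from the label above and assuming
mfid15 really is the blank replacement:

  # name the failed vdev by its guid and replace it with the new disk
  zpool replace sorbs 1702922605 /dev/mfid15

or, per Will's suggestion, once the resilver completes:

  # detach the failed vdev; the spare (mfid14) is promoted in its place
  zpool detach sorbs 1702922605
  # then add the new disk back as a spare
  zpool add sorbs spare /dev/mfid15

Though as noted, the GUID form is currently rejected here ("can't find
vdev").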
Michelle
--
Michelle Sullivan
http://www.mhix.org/