Degraded zpool cannot detach old/bad drive
Rumen Telbizov
telbizov at gmail.com
Fri Oct 29 18:34:44 UTC 2010
Hi Artem, everyone,
Thanks once again for your feedback and help.
Here's more information.
# zpool export tank
# ls /dev/gpt
disk-e1:s10 disk-e1:s11 disk-e1:s12 disk-e1:s13
disk-e1:s14 disk-e1:s15 disk-e1:s16 disk-e1:s18
disk-e1:s19 disk-e1:s20 disk-e1:s21 disk-e1:s22
disk-e1:s23 disk-e1:s3 disk-e1:s4 disk-e1:s5
disk-e1:s6 disk-e1:s7 disk-e1:s8 disk-e1:s9
disk-e2:s0 disk-e2:s1 disk-e2:s10 disk-e2:s11
disk-e2:s2 disk-e2:s3 disk-e2:s4 disk-e2:s5
disk-e2:s6 disk-e2:s7 disk-e2:s8 disk-e2:s9
newdisk-e1:s17
newdisk-e1:s2
All the disks are here! Same for /dev/gptid/. Now importing the pool back
like you suggested:
# zpool import -d /dev/gpt
pool: tank
id: 13504509992978610301
state: UNAVAIL
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
see: http://www.sun.com/msg/ZFS-8000-5E
config:
tank UNAVAIL insufficient replicas
raidz1 ONLINE
gpt/disk-e1:s10 ONLINE
mfid9p1 ONLINE
mfid10p1 ONLINE
mfid11p1 ONLINE
It's missing a ton of drives. Setting kern.geom.label.gptid.enable=0 makes no
difference either.
If I import the pool normally I get the same result as before: the pool
imports OK, but most of the disks are referred to as mfidXXX instead of
/dev/gpt/disk-XX, and here's what I have left:
# ls /dev/gpt
disk-e1:s10 disk-e1:s20 disk-e2:s0
The problem, I think, comes down to what is written in the zpool.cache file:
it stores the mfid path instead of the gpt/disk one.
children[0]
type='disk'
id=0
guid=1641394056824955485
path='/dev/mfid33p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1,0:a'
whole_disk=0
DTL=55
Compared to a disk from a partner server which is fine:
children[0]
type='disk'
id=0
guid=5513814503830705577
path='/dev/gpt/disk-e1:s6'
whole_disk=0
I suspect OpenSolaris overwrote that part. So I wonder if there's a way to
actually edit the /boot/zfs/zpool.cache file, replace path with the
corresponding /dev/gpt entry and remove the phys_path one. I don't know about
DTL. Is there a way to do this, and how stupid does that idea sound to you?
They should still point to the same data after all?
I cannot find a good zdb tutorial, so this is what I've got for now:
# zdb
tank
version=14
name='tank'
state=0
txg=206266
pool_guid=13504509992978610301
hostid=409325918
hostname='XXXX'
vdev_tree
type='root'
id=0
guid=13504509992978610301
children[0]
type='raidz'
id=0
guid=3740854890192825394
nparity=1
metaslab_array=33
metaslab_shift=36
ashift=9
asize=7995163410432
is_log=0
children[0]
type='disk'
id=0
guid=1641394056824955485
path='/dev/mfid33p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1,0:a'
whole_disk=0
DTL=55
children[1]
type='disk'
id=1
guid=6047192237176807561
path='/dev/mfid1p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@2,0:a'
whole_disk=0
DTL=250
children[2]
type='disk'
id=2
guid=9178318500891071208
path='/dev/mfid2p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@3,0:a'
whole_disk=0
DTL=249
children[3]
type='disk'
id=3
guid=2567999855746767831
path='/dev/mfid3p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@4,0:a'
whole_disk=0
DTL=248
children[1]
type='raidz'
id=1
guid=17097047310177793733
nparity=1
metaslab_array=31
metaslab_shift=36
ashift=9
asize=7995163410432
is_log=0
children[0]
type='disk'
id=0
guid=14513380297393196654
path='/dev/mfid4p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@5,0:a'
whole_disk=0
DTL=266
children[1]
type='disk'
id=1
guid=7673391645329839273
path='/dev/mfid5p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@6,0:a'
whole_disk=0
DTL=265
children[2]
type='disk'
id=2
guid=15189132305590412134
path='/dev/mfid6p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@7,0:a'
whole_disk=0
DTL=264
children[3]
type='disk'
id=3
guid=17171875527714022076
path='/dev/mfid7p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@8,0:a'
whole_disk=0
DTL=263
children[2]
type='raidz'
id=2
guid=4551002265962803186
nparity=1
metaslab_array=30
metaslab_shift=36
ashift=9
asize=7995163410432
is_log=0
children[0]
type='disk'
id=0
guid=12104241519484712161
path='/dev/gpt/disk-e1:s10'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@9,0:a'
whole_disk=0
DTL=262
children[1]
type='disk'
id=1
guid=3950210349623142325
path='/dev/mfid9p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@a,0:a'
whole_disk=0
DTL=261
children[2]
type='disk'
id=2
guid=14559903955698640085
path='/dev/mfid10p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@b,0:a'
whole_disk=0
DTL=260
children[3]
type='disk'
id=3
guid=12364155114844220066
path='/dev/mfid11p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@c,0:a'
whole_disk=0
DTL=259
children[3]
type='raidz'
id=3
guid=12517231224568010294
nparity=1
metaslab_array=29
metaslab_shift=36
ashift=9
asize=7995163410432
is_log=0
children[0]
type='disk'
id=0
guid=7655789038925330983
path='/dev/mfid12p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@d,0:a'
whole_disk=0
DTL=258
children[1]
type='disk'
id=1
guid=17815755378968233141
path='/dev/mfid13p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@e,0:a'
whole_disk=0
DTL=257
children[2]
type='disk'
id=2
guid=9590421681925673767
path='/dev/mfid14p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@f,0:a'
whole_disk=0
DTL=256
children[3]
type='disk'
id=3
guid=13312724999073057440
path='/dev/mfid34p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@10,0:a'
whole_disk=0
DTL=60
children[4]
type='raidz'
id=4
guid=7622366288306613136
nparity=1
metaslab_array=28
metaslab_shift=36
ashift=9
asize=7995163410432
is_log=0
children[0]
type='disk'
id=0
guid=11283483106921343963
path='/dev/mfid15p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@11,0:a'
whole_disk=0
DTL=254
children[1]
type='disk'
id=1
guid=14900597968455968576
path='/dev/mfid16p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@12,0:a'
whole_disk=0
DTL=253
children[2]
type='disk'
id=2
guid=4140592611852504513
path='/dev/gpt/disk-e1:s20'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@13,0:a'
whole_disk=0
DTL=252
children[3]
type='disk'
id=3
guid=2794215380207576975
path='/dev/mfid18p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@14,0:a'
whole_disk=0
DTL=251
children[5]
type='raidz'
id=5
guid=17655293908271300889
nparity=1
metaslab_array=27
metaslab_shift=36
ashift=9
asize=7995163410432
is_log=0
children[0]
type='disk'
id=0
guid=5274146379037055039
path='/dev/mfid19p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@15,0:a'
whole_disk=0
DTL=278
children[1]
type='disk'
id=1
guid=8651755019404873686
path='/dev/mfid20p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@16,0:a'
whole_disk=0
DTL=277
children[2]
type='disk'
id=2
guid=16827379661759988976
path='/dev/gpt/disk-e2:s0'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@17,0:a'
whole_disk=0
DTL=276
children[3]
type='disk'
id=3
guid=2524967151333933972
path='/dev/mfid22p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@18,0:a'
whole_disk=0
DTL=275
children[6]
type='raidz'
id=6
guid=2413519694016115220
nparity=1
metaslab_array=26
metaslab_shift=36
ashift=9
asize=7995163410432
is_log=0
children[0]
type='disk'
id=0
guid=16361968944335143412
path='/dev/mfid23p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@19,0:a'
whole_disk=0
DTL=274
children[1]
type='disk'
id=1
guid=10054650477559530937
path='/dev/mfid24p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1a,0:a'
whole_disk=0
DTL=273
children[2]
type='disk'
id=2
guid=17105959045159531558
path='/dev/mfid25p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1b,0:a'
whole_disk=0
DTL=272
children[3]
type='disk'
id=3
guid=17370453969371497663
path='/dev/mfid26p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1c,0:a'
whole_disk=0
DTL=271
children[7]
type='raidz'
id=7
guid=4614010953103453823
nparity=1
metaslab_array=24
metaslab_shift=36
ashift=9
asize=7995163410432
is_log=0
children[0]
type='disk'
id=0
guid=10090128057592036175
path='/dev/mfid27p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1d,0:a'
whole_disk=0
DTL=270
children[1]
type='disk'
id=1
guid=16676544025008223925
path='/dev/mfid28p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1e,0:a'
whole_disk=0
DTL=269
children[2]
type='disk'
id=2
guid=11777789246954957292
path='/dev/mfid29p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@1f,0:a'
whole_disk=0
DTL=268
children[3]
type='disk'
id=3
guid=3406600121427522915
path='/dev/mfid30p1'
phys_path='/pci@0,0/pci8086,3b42@1c/pci15d9,c480@0/sd@20,0:a'
whole_disk=0
DTL=267
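For anyone who wants to tally which vdevs in a dump like the one above are still referenced by GPT label and which by raw mfid device, here is a minimal sketch. It assumes the zdb output has been saved as plain text in the key='value' form shown above; the names group_vdev_paths and PATH_RE are hypothetical, chosen only for illustration, and the script only reads text, it does not touch the pool or zpool.cache.

```python
import re
from collections import defaultdict

# Matches lines such as  path='/dev/mfid33p1'  (optionally wrapped in '*'
# emphasis markers). Lines starting with phys_path= do not match, because
# the key must be exactly "path" after optional whitespace/asterisks.
PATH_RE = re.compile(r"^[\s*]*path='([^']+)'")


def group_vdev_paths(zdb_text):
    """Group leaf-vdev paths found in zdb-style text by parent directory.

    Returns a dict such as {'/dev': [...], '/dev/gpt': [...]}.
    Assumption: every path of interest is on its own line as path='...'.
    """
    groups = defaultdict(list)
    for line in zdb_text.splitlines():
        match = PATH_RE.match(line)
        if match:
            path = match.group(1)
            parent = path.rsplit("/", 1)[0]  # '/dev/gpt/x' -> '/dev/gpt'
            groups[parent].append(path)
    return dict(groups)
```

Fed the full dump above, this should list the three label-based entries (disk-e1:s10, disk-e1:s20, disk-e2:s0) under /dev/gpt and the remaining 29 entries under /dev, which matches what is left in /dev/gpt after a normal import.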
Your help is highly appreciated.
Thank you very much,
Rumen Telbizov
On Fri, Oct 29, 2010 at 12:26 AM, Artem Belevich <fbsdlist at src.cx> wrote:
> On Thu, Oct 28, 2010 at 10:51 PM, Rumen Telbizov <telbizov at gmail.com>
> wrote:
> > Hi Artem, everyone,
> >
> > Thanks for your quick response. Unfortunately I already did try this
> > approach. Applying -d /dev/gpt only limits the pool to the bare three
> > remaining disks which turns pool completely unusable (no mfid devices).
> > Maybe those labels are removed shortly they are being tried to be
> > imported/accessed?
>
> In one of the previous emails you've clearly listed many devices in
> /dev/gpt and said that they've disappeared after pool import.
> Did you do "zpool import -d /dev/gpt" while /dev/gpt entries were present?
>
> > What I don't understand is what exactly makes those gpt labels disappear
> > when the pool is imported and otherwise are just fine?!
>
> This is the way GEOM works. If something (ZFS in this case) uses raw
> device, derived GEOM entities disappear.
>
> Try exporting the pool. Your /dev/gpt entries should be back. Now try
> to import with -d option and see if it works.
>
> You may try bringing the labels back the hard way by detaching raw
> drive and then re-attaching it via the label, but resilvering one
> drive at a time will take a while.
>
> --Artem
>
--
Rumen Telbizov
http://telbizov.com