Problem with ZFSD when replacing a failed disk
Jean-Marc LACROIX
Jean-Marc.Lacroix at unice.fr
Thu Oct 4 15:32:55 UTC 2018
Hello,
We have encountered a problem on our storage system, which consists of
one Dell R630 with three SAS-attached Dell MD1420 JBODs.
Our system is FreeBSD 11.0-RELEASE-p9.
We use ZFSD to manage the spare disks.
The problem appears when we have to replace a failed disk, as explained below.
We don't understand why the second spare is activated when the replace
command is issued.
Thanks in advance for your help.
Regards,
JM
root@math12:/ # zpool status
  pool: zpool
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: resilvered 238G in 11h3m with 0 errors on Sat Sep 29 19:47:28 2018
config:

        NAME                       STATE     READ WRITE CKSUM
        zpool                      DEGRADED     0     0     0
          raidz2-0                 ONLINE       0     0     0
            label/e0s0             ONLINE       0     0     0
            label/e1s0             ONLINE       0     0     0
            label/e2s0             ONLINE       0     0     0
            label/e0s1             ONLINE       0     0     0
            label/e1s1             ONLINE       0     0     0
            label/e2s1             ONLINE       0     0     0
          raidz2-1                 ONLINE       0     0     0
            label/e0s2             ONLINE       0     0     0
            label/e1s2             ONLINE       0     0     0
            label/e2s2             ONLINE       0     0     0
            label/e0s3             ONLINE       0     0     0
            label/e1s3             ONLINE       0     0     0
            label/e2s3             ONLINE       0     0     0
          raidz2-2                 ONLINE       0     0     0
            label/e0s4             ONLINE       0     0     0
            label/e1s4             ONLINE       0     0     0
            label/e2s4             ONLINE       0     0     0
            label/e0s5             ONLINE       0     0     0
            label/e1s5             ONLINE       0     0     0
            label/e2s5             ONLINE       0     0     0
          raidz2-3                 ONLINE       0     0     0
            label/e0s6             ONLINE       0     0     0
            label/e1s6             ONLINE       0     0     0
            label/e2s6             ONLINE       0     0     0
            label/e0s7             ONLINE       0     0     0
            label/e1s7             ONLINE       0     0     0
            label/e2s7             ONLINE       0     0     0
          raidz2-4                 ONLINE       0     0     0
            label/e0s8             ONLINE       0     0     0
            label/e1s8             ONLINE       0     0     0
            label/e2s8             ONLINE       0     0     0
            label/e0s9             ONLINE       0     0     0
            label/e1s9             ONLINE       0     0     0
            label/e2s9             ONLINE       0     0     0
          raidz2-5                 DEGRADED     0     0     0
            label/e0s10            ONLINE       0     0     0
            label/e1s10            ONLINE       0     0     0
            label/e2s10            ONLINE       0     0     0
            label/e0s11            ONLINE       0     0     0
            spare-4                UNAVAIL      0     0     0
              9796387366129075446  UNAVAIL      0     0     0  was /dev/label/e1s11
              label/spare0         ONLINE       0     0     0
            label/e2s11            ONLINE       0     0     0
          raidz2-6                 ONLINE       0     0     0
            label/e0s12            ONLINE       0     0     0
            label/e1s12            ONLINE       0     0     0
            label/e2s12            ONLINE       0     0     0
            label/e0s13            ONLINE       0     0     0
            label/e1s13            ONLINE       0     0     0
            label/e2s13            ONLINE       0     0     0
          raidz2-7                 ONLINE       0     0     0
            label/e0s14            ONLINE       0     0     0
            label/e1s14            ONLINE       0     0     0
            label/e2s14            ONLINE       0     0     0
            label/e0s15            ONLINE       0     0     0
            label/e1s15            ONLINE       0     0     0
            label/e2s15            ONLINE       0     0     0
          raidz2-8                 ONLINE       0     0     0
            label/e0s16            ONLINE       0     0     0
            label/e1s16            ONLINE       0     0     0
            label/e2s16            ONLINE       0     0     0
            label/e0s17            ONLINE       0     0     0
            label/e1s17            ONLINE       0     0     0
            label/e2s17            ONLINE       0     0     0
        logs
          mirror-9                 ONLINE       0     0     0
            da2                    ONLINE       0     0     0
            da3                    ONLINE       0     0     0
        spares
          12553822586401141982     INUSE     was /dev/label/spare0
          label/spare1             AVAIL

errors: No known data errors
===========================================================================
STEPS DONE:
1) Unplug the blinking UNAVAIL disk.
2) Plug the new disk into the free slot.
=> The system console shows the new device name:
ugen0.4: <Avocent> at usbus0
umass0: <SCSI Transparent Interface 0> on usbus0
umass0: SCSI over Bulk-Only; quirks = 0x4100
umass0:16:0: Attached to scbus16
da59 at umass-sim0 bus 0 scbus16 target 0 lun 0
da59: <iDRAC DRACRW 0329> Removable Direct Access SCSI device
da59: 40.000MB/s transfers
da59: 308MB (630784 512 byte sectors)
da59: quirks=0x2<NO_6_BYTE>
ugen0.4: <Avocent> at usbus0 (disconnected)
umass0: at uhub4, port 2, addr 4 (disconnected)
da59 at umass-sim0 bus 0 scbus16 target 0 lun 0
da59: <iDRAC DRACRW 0329> detached
(da59:umass-sim0:0:0:0): Periph destroyed
(da59:mrsas1:1:61:0): UNMAPPED
da59 at mrsas1 bus 1 scbus3 target 61 lun 0
da59: <SEAGATE ST91000640SS AS0B> Fixed Direct Access SPC-4 SCSI device
da59: Serial Number 9XG9RH37
da59: 150.000MB/s transfers
da59: 953869MB (1953525168 512 byte sectors)
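3) Label the new disk and check the label:
glabel label e1s11 /dev/da59
glabel status da59
=> The disk is correctly labeled.
4) Replace the failed device with the new disk and check the pool:
zpool replace zpool 9796387366129075446 label/e1s11
zpool status
=> The problem described below appears.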
==========================================================================
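The zpool replace step above takes the GUID of the missing device, which zpool status prints in place of the device name. A minimal sketch of pulling that GUID out of the status text instead of copying it by hand follows. It assumes the row layout shown in the listing above; `unavail_guids` is a hypothetical helper name, not a ZFS command.

```sh
#!/bin/sh
# Sketch: list vdevs that are shown only by GUID because the device is
# missing. Reads `zpool status` text on stdin and prints "GUID former-path".
# unavail_guids is a hypothetical helper, not part of ZFS.
unavail_guids() {
    awk '
        # Matches rows such as:
        #   9796387366129075446 UNAVAIL 0 0 0 was /dev/label/e1s11
        $1 ~ /^[0-9]+$/ && $2 == "UNAVAIL" && $(NF - 1) == "was" {
            print $1, $NF
        }
    '
}

# Possible usage on the affected host (not executed here):
#   zpool status zpool | unavail_guids
```

The first column of the result is what would be passed as the old device to zpool replace; the second column only serves as a cross-check against the slot being serviced.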
After the replace command, we can see that the second hot spare
(label/spare1) has been activated, as follows:
root@math12:/ # zpool status
  pool: zpool
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Oct 4 15:39:49 2018
        214G scanned out of 12.6T at 191M/s, 18h49m to go
        7.89G resilvered, 1.66% done
config:

        NAME                           STATE     READ WRITE CKSUM
        zpool                          DEGRADED     0     0     0
          raidz2-0                     ONLINE       0     0     0
            label/e0s0                 ONLINE       0     0     0
            label/e1s0                 ONLINE       0     0     0
            label/e2s0                 ONLINE       0     0     0
            label/e0s1                 ONLINE       0     0     0
            label/e1s1                 ONLINE       0     0     0
            label/e2s1                 ONLINE       0     0     0
          raidz2-1                     ONLINE       0     0     0
            label/e0s2                 ONLINE       0     0     0
            label/e1s2                 ONLINE       0     0     0
            label/e2s2                 ONLINE       0     0     0
            label/e0s3                 ONLINE       0     0     0
            label/e1s3                 ONLINE       0     0     0
            label/e2s3                 ONLINE       0     0     0
          raidz2-2                     ONLINE       0     0     0
            label/e0s4                 ONLINE       0     0     0
            label/e1s4                 ONLINE       0     0     0
            label/e2s4                 ONLINE       0     0     0
            label/e0s5                 ONLINE       0     0     0
            label/e1s5                 ONLINE       0     0     0
            label/e2s5                 ONLINE       0     0     0
          raidz2-3                     ONLINE       0     0     0
            label/e0s6                 ONLINE       0     0     0
            label/e1s6                 ONLINE       0     0     0
            label/e2s6                 ONLINE       0     0     0
            label/e0s7                 ONLINE       0     0     0
            label/e1s7                 ONLINE       0     0     0
            label/e2s7                 ONLINE       0     0     0
          raidz2-4                     ONLINE       0     0     0
            label/e0s8                 ONLINE       0     0     0
            label/e1s8                 ONLINE       0     0     0
            label/e2s8                 ONLINE       0     0     0
            label/e0s9                 ONLINE       0     0     0
            label/e1s9                 ONLINE       0     0     0
            label/e2s9                 ONLINE       0     0     0
          raidz2-5                     DEGRADED     0     0     0
            label/e0s10                ONLINE       0     0     0
            label/e1s10                ONLINE       0     0     0
            label/e2s10                ONLINE       0     0     0
            label/e0s11                ONLINE       0     0     0
            spare-4                    UNAVAIL      0     0     0
              replacing-0              UNAVAIL      0     0     0
                spare-0                UNAVAIL      0     0     0
                  9796387366129075446  UNAVAIL      0     0     0  was /dev/label/e1s11/old
                  label/spare1         ONLINE       0     0     0  (resilvering)
                label/e1s11            ONLINE       0     0     0  (resilvering)
              label/spare0             ONLINE       0     0     0
            label/e2s11                ONLINE       0     0     0
          raidz2-6                     ONLINE       0     0     0
            label/e0s12                ONLINE       0     0     0
            label/e1s12                ONLINE       0     0     0
            label/e2s12                ONLINE       0     0     0
            label/e0s13                ONLINE       0     0     0
            label/e1s13                ONLINE       0     0     0
            label/e2s13                ONLINE       0     0     0
          raidz2-7                     ONLINE       0     0     0
            label/e0s14                ONLINE       0     0     0
            label/e1s14                ONLINE       0     0     0
            label/e2s14                ONLINE       0     0     0
            label/e0s15                ONLINE       0     0     0
            label/e1s15                ONLINE       0     0     0
            label/e2s15                ONLINE       0     0     0
          raidz2-8                     ONLINE       0     0     0
            label/e0s16                ONLINE       0     0     0
            label/e1s16                ONLINE       0     0     0
            label/e2s16                ONLINE       0     0     0
            label/e0s17                ONLINE       0     0     0
            label/e1s17                ONLINE       0     0     0
            label/e2s17                ONLINE       0     0     0
        logs
          mirror-9                     ONLINE       0     0     0
            da2                        ONLINE       0     0     0
            da3                        ONLINE       0     0     0
        spares
          12553822586401141982         INUSE     was /dev/label/spare0
          15637882846021217179         INUSE     was /dev/label/spare1

errors: No known data errors
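
To compare the spare state before and after the replace without reading the whole listing, the "spares" section can be filtered out of the zpool status text. This is only a sketch: it assumes "spares" is the last section before the "errors:" line, as in the output above, and `spare_states` is a hypothetical helper name, not a ZFS command.

```sh
#!/bin/sh
# Sketch: print each hot spare and its state from `zpool status` text on
# stdin. spare_states is a hypothetical helper, not part of ZFS.
spare_states() {
    awk '
        # The section header is a line holding only the word "spares".
        NF == 1 && $1 == "spares" { in_spares = 1; next }
        # Assumption: the "errors:" line ends the spares section.
        $1 == "errors:"           { in_spares = 0 }
        in_spares && NF >= 2 {
            # In-use spares are shown as "GUID INUSE was /dev/...";
            # report the former path instead of the GUID in that case.
            name = ($3 == "was") ? $4 : $1
            print name, $2
        }
    '
}

# Possible usage on the affected host (not executed here):
#   zpool status zpool | spare_states
```

Run against the first listing, this reports spare0 as INUSE and spare1 as AVAIL; against the second listing, both are INUSE, which is the behaviour being asked about.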