Re: ZFS replace a mirrored disk
- In reply to: Christos Chatzaras : "ZFS replace a mirrored disk"
Date: Wed, 11 May 2022 19:24:39 UTC
On 5/11/22 03:22, Christos Chatzaras wrote:
> When a disk fails and I want to replace it with a NEW disk, are these commands correct?
>
> ----
> gpart backup nvd1 | gpart restore -F nvd0
> gmirror forget swap
> gmirror insert swap /dev/nvd0p2
> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 nvd0
> zpool replace zroot nvd0
> ----
>
>
> I tried to simulate a disk "failure". I tried the above commands with the SAME disk without success. I believe the "issue" is that the "new" disk is the SAME as before.
>
> I did these steps:
>
> 1) Boot with mfsBSD and did "gpart destroy -F nvd0"
> 2) Reboot the server in the main OS.
>
> If the "new" disk is the SAME as before do I have to change the commands to these?
>
> ----
> gpart backup nvd1 | gpart restore -F nvd0
> gmirror forget swap
> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 nvd0
> zpool offline zroot nvd0
> zpool online zroot nvd0
> ----
>
>
> Also, I noticed that with the SAME disk, "gmirror forget swap" starts rebuilding swap immediately and "gmirror insert swap /dev/nvd0p2" is not needed. Is this the correct behaviour when the "new" disk is the SAME?
If you are using ZFS, you should use ZFS commands to replace drives.
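The general form is 'zpool replace [-f] pool old-vdev [new-vdev]'; if the
replacement disk shows up at the same device path as the failed vdev, the
new vdev can be omitted. As a rough sketch only (nvd0p3 is just a guess at
your pool's vdev name; check 'zpool status zroot' for the exact name first):

    zpool replace zroot nvd0p3     # replace the failed vdev in place
    zpool status zroot             # watch the resilver progress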
I had a bad HDD in a ZFS pool about two years ago:
2020-03-01 15:55:19 toor@f3 ~
# zpool status p3
  pool: p3
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: none requested
config:

        NAME                     STATE     READ WRITE CKSUM
        p3                       DEGRADED     0     0     0
          mirror-0               DEGRADED     0     0     0
            4744083090086529196  UNAVAIL      0     0     0  was /dev/gpt/p3a.eli
            gpt/p3b.eli          ONLINE       0     0     0

errors: No known data errors
I physically removed the bad HDD, installed a new HDD, created a GPT
partition table, added one large partition, initialized a GELI provider,
saved the GELI metadata backup, edited /etc/rc.conf with the revised
GELI settings, attached the new GELI provider, and then used zpool(8) to
replace the old provider with the new provider:
2020-03-01 15:59:48 toor@f3 ~
# zpool replace p3 4744083090086529196 gpt/p3c.eli
2020-03-01 16:01:12 toor@f3 ~
# zpool status p3
  pool: p3
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Mar  1 16:01:08 2020
        2.64G scanned at 159M/s, 0 issued at 0/s, 1.71T total
        0 resilvered, 0.00% done, no estimated completion time
config:

        NAME                       STATE     READ WRITE CKSUM
        p3                         DEGRADED     0     0     0
          mirror-0                 DEGRADED     0     0     0
            replacing-0            UNAVAIL      0     0     0
              4744083090086529196  UNAVAIL      0     0     0  was /dev/gpt/p3a.eli
              gpt/p3c.eli          ONLINE       0     0     0
            gpt/p3b.eli            ONLINE       0     0     0

errors: No known data errors
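For reference, the preparation on the new disk before that zpool replace
looked roughly like the following. Treat it as a sketch: ada2 and the p3c
label are just placeholders for my setup, and your geli init options
(sector size, key file) may well differ:

    gpart create -s gpt ada2                       # new GPT table on the new disk
    gpart add -t freebsd-zfs -l p3c ada2           # one large partition, GPT label p3c
    geli init -s 4096 /dev/gpt/p3c                 # new GELI provider (prompts for passphrase)
    geli backup /dev/gpt/p3c /root/p3c.eli.backup  # save the GELI metadata backup
    geli attach /dev/gpt/p3c                       # creates /dev/gpt/p3c.eli for the pool

plus the matching GELI entries in /etc/rc.conf so the provider attaches
again at boot, as mentioned above.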
A few weeks later, I scrubbed the pool:
2020-03-26 23:09:16 toor@f3 ~
# zpool scrub p3
This was the pool status when the scrub finished:
2020-03-27 12:52:27 toor@f3 ~
# zpool status p3
  pool: p3
 state: ONLINE
  scan: scrub repaired 0 in 0 days 04:42:14 with 0 errors on Fri Mar 27 03:51:36 2020
config:

        NAME             STATE     READ WRITE CKSUM
        p3               ONLINE       0     0     0
          mirror-0       ONLINE       0     0     0
            gpt/p3c.eli  ONLINE       0     0     0
            gpt/p3b.eli  ONLINE       0     0     0

errors: No known data errors
David