Re: ZFS replace a mirrored disk
Date: Wed, 11 May 2022 22:18:03 UTC
> First please define "without success", what doesn't work?
>
> please paste output of:
>
> $> gpart show nvd1
>
> also, is it a UEFI system or classical BIOS with GPT? What FreeBSD
> version?
>
> zpool replace zroot nvd0 is invalid, you should use:
>
> $> zpool replace zroot nvd1 nvd0 (but it uses the entire disk, which is
> probably incorrect too)
It's legacy BIOS with GPT.
What I want to do is "simulate" a disk failure and rebuild the RAID-1.
First I run these commands from the main OS:
------------------------
$> gpart show
=>      40  7501476448  nvd0  GPT  (3.5T)
        40        1024  1  freebsd-boot  (512K)
      1064         984     - free -  (492K)
      2048    33554432  2  freebsd-swap  (16G)
  33556480  7467919360  3  freebsd-zfs  (3.5T)
7501475840         648     - free -  (324K)

=>      40  7501476448  nvd1  GPT  (3.5T)
        40        1024  1  freebsd-boot  (512K)
      1064         984     - free -  (492K)
      2048    33554432  2  freebsd-swap  (16G)
  33556480  7467919360  3  freebsd-zfs  (3.5T)
7501475840         648     - free -  (324K)
$> zpool status
pool: zroot
state: ONLINE
config:
        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            nvd0p3  ONLINE       0     0     0
            nvd1p3  ONLINE       0     0     0
errors: No known data errors
------------------------------
Then I boot with mfsBSD and run this command to "simulate" a disk failure:
$> gpart destroy -F nvd0
------------------------------
Then I boot again into the main OS and run these commands:
$> zpool status
pool: zroot
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
config:
        NAME        STATE     READ WRITE CKSUM
        zroot       DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            nvd0    UNAVAIL      0     0     0  invalid label
            nvd1p3  ONLINE       0     0     0
errors: No known data errors
$> gmirror status
       Name    Status  Components
mirror/swap  DEGRADED  nvd1p2 (ACTIVE)
------------------------------
Then I back up / restore the partition table:
$> gpart backup nvd1 | gpart restore -F nvd0
$> gpart show
=>      40  7501476448  nvd1  GPT  (3.5T)
        40        1024  1  freebsd-boot  (512K)
      1064         984     - free -  (492K)
      2048    33554432  2  freebsd-swap  (16G)
  33556480  7467919360  3  freebsd-zfs  (3.5T)
7501475840         648     - free -  (324K)

=>      40  7501476448  nvd0  GPT  (3.5T)
        40        1024  1  freebsd-boot  (512K)
      1064         984     - free -  (492K)
      2048    33554432  2  freebsd-swap  (16G)
  33556480  7467919360  3  freebsd-zfs  (3.5T)
7501475840         648     - free -  (324K)
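Side note: "gpart backup | gpart restore" assumes the replacement disk is at least as large as the surviving one. For a new disk of a different size I suppose the layout would have to be recreated by hand, roughly like this, with the sizes taken from the output above:
$> gpart create -s gpt nvd0
$> gpart add -t freebsd-boot -s 512k nvd0
$> gpart add -t freebsd-swap -a 1m -s 16g nvd0
$> gpart add -t freebsd-zfs -a 1m nvd0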
------------------------------
Without doing a "gmirror forget swap" and "gmirror insert swap /dev/nvd0p2", I see that swap is already mirrored:
$> gmirror status
       Name    Status  Components
mirror/swap  COMPLETE  nvd1p2 (ACTIVE)
                       nvd0p2 (ACTIVE)
So the first question is whether the swap is mirrored automatically because nvd0 is the same disk (not replaced by a new disk).
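My guess is that with a truly new disk the swap mirror would stay DEGRADED and would have to be rebuilt by hand, roughly:
$> gmirror forget swap
$> gmirror insert swap /dev/nvd0p2
but since this is the same nvd0p2, gmirror apparently finds its old metadata again once the partition table is back.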
-------------------------------
Then I write the bootloader:
$> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 nvd0
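Since the UEFI question came up: this box is legacy BIOS, so pmbr + gptzfsboot in the freebsd-boot partition should be all that is needed. If it were a UEFI system, partition 1 would be an efi partition instead of freebsd-boot, and as far as I understand the equivalent step would be roughly:
$> newfs_msdos -F 32 /dev/nvd0p1
$> mount -t msdosfs /dev/nvd0p1 /mnt
$> mkdir -p /mnt/EFI/BOOT
$> cp /boot/loader.efi /mnt/EFI/BOOT/BOOTX64.efi
$> umount /mnt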
--------------------------------
Then I want to add this disk back to the zpool, but these commands don't do it:
$> zpool replace zroot nvd0
invalid vdev specification
use '-f' to override the following errors:
/dev/nvd0 is part of active pool 'zroot'
$> zpool replace -f zroot nvd0
invalid vdev specification
the following errors must be manually repaired:
/dev/nvd0 is part of active pool 'zroot'
-----------------------------------
These commands don't work either:
$> zpool replace zroot nvd1 nvd0
invalid vdev specification
use '-f' to override the following errors:
/dev/nvd0 is part of active pool 'zroot'
$> zpool replace -f zroot nvd1 nvd0
invalid vdev specification
the following errors must be manually repaired:
/dev/nvd0 is part of active pool 'zroot'
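I suppose the "part of active pool" complaint is because the old ZFS label is still sitting on nvd0p3 after the partition table was restored, so ZFS treats the disk as a current member rather than a replacement. If it really were a new disk (or after wiping the stale label), I assume the replace would have to name the zfs partition, not the whole disk, something like:
$> zpool labelclear -f /dev/nvd0p3 (only on a disk that is really out of the pool)
$> zpool replace zroot nvd0 nvd0p3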
-----------------------------------
Instead these commands work:
$> zpool offline zroot nvd0
$> zpool status
pool: zroot
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
config:
        NAME        STATE     READ WRITE CKSUM
        zroot       DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            nvd0    OFFLINE      0     0     0
            nvd1p3  ONLINE       0     0     0
errors: No known data errors
$> zpool online zroot nvd0
$> zpool status
pool: zroot
state: ONLINE
scan: resilvered 5.55M in 00:00:00 with 0 errors on Thu May 12 00:22:13 2022
config:
        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            nvd0p3  ONLINE       0     0     0
            nvd1p3  ONLINE       0     0     0
errors: No known data errors
------------------------------------
The second question is whether, instead of "zpool replace zroot nvd0", I had to use "zpool offline zroot nvd0" and "zpool online zroot nvd0", because nvd0 is the same disk (not replaced by a new disk).
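If the old vdev could not be addressed by name at all, I believe one could list the vdev GUIDs with "zpool status -g" and pass the GUID as the old device, e.g.:
$> zpool status -g zroot
$> zpool replace zroot <guid-of-missing-vdev> nvd0p3
(the GUID being the number shown in place of the device name).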
Also, I notice that if I don't do "zpool offline zroot nvd0" and "zpool online zroot nvd0", but do a server reboot instead, then zpool automatically puts nvd0 online:
$> zpool status
pool: zroot
state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
scan: resilvered 3.50M in 00:00:00 with 0 errors on Thu May 12 01:04:09 2022
config:
        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            nvd0p3  ONLINE       0     0     2
            nvd1p3  ONLINE       0     0     0
errors: No known data errors
$> zpool clear zroot
$> zpool status
pool: zroot
state: ONLINE
scan: resilvered 3.50M in 00:00:00 with 0 errors on Thu May 12 01:04:09 2022
config:
        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            nvd0p3  ONLINE       0     0     0
            nvd1p3  ONLINE       0     0     0
errors: No known data errors
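After the clear I will probably also run a scrub to verify both halves of the mirror:
$> zpool scrub zroot
$> zpool status zroot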