Re: ZFS replace a mirrored disk
- In reply to: Christos Chatzaras : "ZFS replace a mirrored disk"
Date: Wed, 11 May 2022 19:24:39 UTC
On 5/11/22 03:22, Christos Chatzaras wrote:
> When a disk fails and I want to replace it with a NEW disk, are these commands correct?
>
> ----
> gpart backup nvd1 | gpart restore -F nvd0
> gmirror forget swap
> gmirror insert swap /dev/nvd0p2
> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 nvd0
> zpool replace zroot nvd0
> ----
>
>
> I tried to simulate a disk "failure". I tried the above commands with the SAME disk without success. I believe the "issue" is that the "new" disk is the SAME as before.
>
> I did these steps:
>
> 1) Boot with mfsBSD and did "gpart destroy -F nvd0"
> 2) Reboot the server in the main OS.
>
> If the "new" disk is the SAME as before do I have to change the commands to these?
>
> ----
> gpart backup nvd1 | gpart restore -F nvd0
> gmirror forget swap
> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 nvd0
> zpool offline zroot nvd0
> zpool online zroot nvd0
> ----
>
>
> Also, I noticed that with the SAME disk, "gmirror forget swap" starts rebuilding swap immediately and "gmirror insert swap /dev/nvd0p2" is not needed. Is this the correct behaviour when the "new" disk is the SAME?
If you are using ZFS, you should use ZFS commands to replace drives.
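The general form is 'zpool replace [-f] pool old-vdev [new-vdev]'; if the
replacement disk shows up at the same device path as the failed vdev, the
new vdev can be omitted. As a rough sketch only (nvd0p3 is just a guess at
your pool's vdev name; check 'zpool status zroot' for the exact name first):

    zpool replace zroot nvd0p3     # replace the failed vdev in place
    zpool status zroot             # watch the resilver progress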
I had a bad HDD in a ZFS pool about two years ago:
2020-03-01 15:55:19 toor@f3 ~
# zpool status p3
  pool: p3
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: none requested
config:

        NAME                     STATE     READ WRITE CKSUM
        p3                       DEGRADED     0     0     0
          mirror-0               DEGRADED     0     0     0
            4744083090086529196  UNAVAIL      0     0     0  was /dev/gpt/p3a.eli
            gpt/p3b.eli          ONLINE       0     0     0

errors: No known data errors
I physically removed the bad HDD, installed a new HDD, created a GPT
partition table, added one large partition, initialized a GELI provider,
saved the GELI metadata backup, edited /etc/rc.conf with the revised
GELI settings, attached the new GELI provider, and then used zpool(8) to
replace the old provider with the new provider:
2020-03-01 15:59:48 toor@f3 ~
# zpool replace p3 4744083090086529196 gpt/p3c.eli
2020-03-01 16:01:12 toor@f3 ~
# zpool status p3
  pool: p3
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Mar  1 16:01:08 2020
        2.64G scanned at 159M/s, 0 issued at 0/s, 1.71T total
        0 resilvered, 0.00% done, no estimated completion time
config:

        NAME                       STATE     READ WRITE CKSUM
        p3                         DEGRADED     0     0     0
          mirror-0                 DEGRADED     0     0     0
            replacing-0            UNAVAIL      0     0     0
              4744083090086529196  UNAVAIL      0     0     0  was /dev/gpt/p3a.eli
              gpt/p3c.eli          ONLINE       0     0     0
            gpt/p3b.eli            ONLINE       0     0     0

errors: No known data errors
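For reference, the preparation on the new disk before that zpool replace
looked roughly like the following. Treat it as a sketch: ada2 and the p3c
label are just placeholders for my setup, and your geli init options
(sector size, key file) may well differ:

    gpart create -s gpt ada2                       # new GPT table on the new disk
    gpart add -t freebsd-zfs -l p3c ada2           # one large partition, GPT label p3c
    geli init -s 4096 /dev/gpt/p3c                 # new GELI provider (prompts for passphrase)
    geli backup /dev/gpt/p3c /root/p3c.eli.backup  # save the GELI metadata backup
    geli attach /dev/gpt/p3c                       # creates /dev/gpt/p3c.eli for the pool

plus the matching GELI entries in /etc/rc.conf so the provider attaches
again at boot, as mentioned above.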
A few weeks later, I scrubbed the pool:
2020-03-26 23:09:16 toor@f3 ~
# zpool scrub p3
This was the pool status when the scrub finished:
2020-03-27 12:52:27 toor@f3 ~
# zpool status p3
  pool: p3
 state: ONLINE
  scan: scrub repaired 0 in 0 days 04:42:14 with 0 errors on Fri Mar 27 03:51:36 2020
config:

        NAME             STATE     READ WRITE CKSUM
        p3               ONLINE       0     0     0
          mirror-0       ONLINE       0     0     0
            gpt/p3c.eli  ONLINE       0     0     0
            gpt/p3b.eli  ONLINE       0     0     0

errors: No known data errors
David