Re: Replacing a REMOVED drive in DEGRADED zpool

From: Robert <robert_at_webtent.org>
Date: Fri, 22 Aug 2025 04:43:39 UTC

On 8/21/2025 3:04 PM, David Christensen wrote:
> The wrong SATA HBA, cable, backplane, rack, etc., and/or a poor 
> connection anywhere can wreak havoc.  A few years ago, I bought all 
> new SATA III cables and SATA III mobile racks, and replaced the 
> previous mixture of SATA I, II, III stuff in my various computers.  
> Disk reliability improved dramatically.
>
>
> Re-seating all of the SATA cables, re-seating all of the drive power 
> cables, and re-seating the HBA, followed by `zpool online zdb1 ada0p3` 
> and `zpool scrub zdb1` could fix the problem.

Will try that first, thanks!

> Okay -- BIOS/Legacy and GPT.  Those are key parameters (among others) 
> that the FreeBSD installer detects and uses to choose what actions to 
> take.  The goal is to reproduce what the installer did.

Which is...

root@db1:~ # cat /var/backups/gpart.ada0.bak
GPT 128
1   freebsd-boot        40      1024 gptboot0
2   freebsd-swap      2048  16777216 swap0
3    freebsd-zfs  16779264 276267008 zfs0
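As a sanity check on that backup, here is a small sketch that works out each partition's last sector and size. It assumes 512-byte sectors (my assumption; the backup file does not state the sector size) and simply embeds the lines above.

```sh
#!/bin/sh
# Sketch only. Assumption: 512-byte sectors.
layout='GPT 128
1   freebsd-boot        40      1024 gptboot0
2   freebsd-swap      2048  16777216 swap0
3    freebsd-zfs  16779264 276267008 zfs0'

# Skip the "GPT 128" header, then print: index, label, first-last sector, size.
summary=$(printf '%s\n' "$layout" | awk 'NR > 1 {
    last = $3 + $4 - 1
    printf "p%s %s %d-%d %.1f MiB\n", $1, $5, $3, last, $4 * 512 / 1048576
}')
printf '%s\n' "$summary"
```

If I have the arithmetic right, p2 ends at sector 16779263, right before p3 starts, and comes to 8192 MiB, which lines up with the swapinfo figure further down.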

> Please run and post the following commands to check if ada[0-3]p1 
> bootloader stages are the same:
>
> # cmp /dev/ada1p1 /dev/ada0p1

Not until I resolve ada0.

> # cmp /dev/ada1p1 /dev/ada2p1

root@db1:~ # cmp /dev/ada1p1 /dev/ada2p1
/dev/ada1p1 /dev/ada2p1 differ: char 159233, line 417

> # cmp /dev/ada1p1 /dev/ada3p1

Interesting?

root@db1:~ # cmp /dev/ada1p1 /dev/ada3p1
root@db1:~ #

> Please run and post the following commands to check if ada[0-3] MBR's 
> are the same:
>
> # cmp <(dd if=/dev/ada1 count=1 status=none) <(dd if=/dev/ada0 count=1 status=none)
>
> # cmp <(dd if=/dev/ada1 count=1 status=none) <(dd if=/dev/ada2 count=1 status=none)
>
> # cmp <(dd if=/dev/ada1 count=1 status=none) <(dd if=/dev/ada3 count=1 status=none)

root@db1:~ # cmp <(dd if=/dev/ada1 count=1 status=none) <(dd if=/dev/ada0 count=1 status=none)
Missing name for redirect.
root@db1:~ # cmp <(dd if=/dev/ada1 count=1 status=none) <(dd if=/dev/ada2 count=1 status=none)
Missing name for redirect.
root@db1:~ # cmp <(dd if=/dev/ada1 count=1 status=none) <(dd if=/dev/ada3 count=1 status=none)
Missing name for redirect.
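I believe those fail because root's shell here is csh/tcsh, which has no `<( )` process substitution (that is a bash/zsh/ksh feature). A sketch of the same first-sector comparison using temporary files under plain /bin/sh; the function name first_sector_same is made up for illustration, and bs=512 assumes 512-byte sectors:

```sh
#!/bin/sh
# Compare the first 512-byte sector of two disks (or files).
# Returns 0 if identical, 1 if different, 2 on a read or setup error.
first_sector_same() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    if dd if="$1" of="$a" bs=512 count=1 2>/dev/null &&
       dd if="$2" of="$b" bs=512 count=1 2>/dev/null; then
        cmp -s "$a" "$b"
        rc=$?
    else
        rc=2
    fi
    rm -f "$a" "$b"
    return "$rc"
}

# Intended use, run as root from sh rather than csh:
#   first_sector_same /dev/ada1 /dev/ada2 && echo same || echo "differ or error"
```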
> Please run and post the following commands to confirm that gmirror and 
> gstripe are not in use.  Do not bother to load if prompted:
>
> # gmirror status
>
> # gstripe status

Yes, gmirror is in use; I'm guessing a `gmirror insert ada0p2` is needed?

root@db1:~ # gmirror status
        Name    Status  Components
mirror/swap  DEGRADED  ada2p2 (ACTIVE)
                        ada3p2 (ACTIVE)
                        ada1p2 (ACTIVE)
root@db1:~ # gstripe status
gstripe: Command 'status' not available; try 'load' first.
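From my reading of gmirror(8), insert takes the mirror name before the provider, and the missing component may need to be forgotten first. A dry-run sketch that only prints the commands; the mirror name swap comes from the status output above, and ada0p2 assumes the replacement disk ends up with the same partition layout as the old one:

```sh
#!/bin/sh
# Dry run: builds and prints the commands, does not execute them.
mirror=swap        # from "mirror/swap" in the gmirror status output
new_part=ada0p2    # assumption: replacement disk partitioned like the old one

plan=$(printf '%s\n' \
    "gmirror forget ${mirror}" \
    "gmirror insert ${mirror} ${new_part}" \
    "gmirror status ${mirror}")
printf '%s\n' "$plan"
```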

> Please run and post the following command to confirm swap:
>
> # swapinfo

root@db1:~ # swapinfo
Device          1K-blocks     Used    Avail Capacity
/dev/mirror/swap   8388604    30216  8358388     0%

> Please post (redact as necessary) /boot/loader.conf, 
> /boot/loader.conf.d/*, /etc/rc.conf, and /etc/fstab  in case there is 
> anything else we need to consider.

root@db1:~ # cat /boot/loader.conf
geom_mirror_load="YES"
kern.geom.label.disk_ident.enable="0"
kern.geom.label.gptid.enable="0"
cryptodev_load="YES"
zfs_load="YES"
root@db1:~ # ls -lah /boot/loader.conf.d/
total 9
drwxr-xr-x   2 root  wheel     2B May 12  2022 .
drwxr-xr-x  15 root  wheel    71B Jul  4 11:33 ..
root@db1:~ # cat /etc/rc.conf
hostname="db1.REDACTED"
ifconfig_em0="inet REDACTED netmask 255.255.255.192"
defaultrouter="REDACTED.1"

# Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
dumpdev="AUTO"
nrpe3_enable="YES"
sshd_enable="YES"
ntpdate_enable="YES"
ntpd_enable="YES"
named_enable="YES"
postgresql_enable="YES"
nfs_client_enable="YES"
nfs_server_enable="YES"
nfs_server_flags="-u -t -n 4"
rpcbind_enable="YES"
mountd_flags="-r"
mountd_enable="YES"
apache24_enable="YES"
postfix_enable="YES"
sendmail_enable="NONE"
sendmail_submit_enable="NO"
sendmail_outbound_enable="NO"
sendmail_msp_queue_enable="NO"
slapd_enable="YES"
slapd_flags='-h "ldapi://%2fvar%2frun%2fopenldap%2fldapi/ ldap://0.0.0.0:389/ ldaps://0.0.0.0:636/"'
slapd_sockets="/var/run/openldap/ldapi"
mrtg_daemon_enable="YES"
saslauthd_enable="YES"
saslauthd_flags="-a ldap"
root@db1:~ # cat /etc/fstab
# Device                Mountpoint      FStype  Options  Dump    Pass#
/dev/mirror/swap                none    swap    sw              0      0
REDACTED:/mnt/REDACTED /nfs/backup nfs rw 0 0

> Running and posting the commands above should help determine what to 
> do with ada0p1 and ada0p2.

Thanks for the help.

>> Ok, I have a matching drive, so I don't plan on reusing the REMOVED 
>> drive. I usually take the REMOVED one and examine it to see if it can 
>> be re-inserted safely as needed. I was hoping the gpart 
>> backup/restore would be the equivalent of sgdisk in Linux that I've 
>> used many times to duplicate a disk used for replacement in software 
>> RAID.
>
>
> I also use Linux, and FreeBSD gpart(8) backup and restore surprised me 
> in a bad way.  Please see the thread branch starting here:
>
> https://lists.freebsd.org/archives/freebsd-questions/2025-August/006883.html 
>
Yeah, I read it. So gpart backup/restore can be used as long as I pass 
the -l switch to restore?
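If I'm reading gpart(8) right, -l on restore is what brings the partition labels back (and -F would first destroy any existing table on the target). A dry-run sketch that only prints the commands, assuming the replacement drive shows up as ada0 and the backup file shown earlier is the one to use:

```sh
#!/bin/sh
# Dry run: builds and prints the commands, does not execute them.
disk=ada0                                  # assumption: replacement appears as ada0
backup=/var/backups/gpart.${disk}.bak      # backup file shown earlier in this message

restore_cmd="gpart restore -l ${disk} < ${backup}"
verify_cmd="gpart show -l ${disk}"
printf '%s\n' "$restore_cmd" "$verify_cmd"
```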
>>> I would use zpool-attach(8) to add the replacement ada0p3 as a 
>>> mirror of ada1p3.
>>
>> Attach in place of zpool-replace?
>
>
> AIUI RTFM zpool(8) if you detach one of two drives in a mirror, the 
> detached drive is forgotten, the mirror goes away, and the remaining 
> disk becomes striped at the top level.  When you install a new drive 
> and want to create a mirror with a singular striped drive, use 
> zpool(8) attach.

This is the plan now, thanks to all y'all's help, unless I get away with 
re-seating everything and running zpool online on the disk.
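Spelled out as a dry run (it only prints), using the pool and partition names from earlier in the thread. The first pair is the re-seat path David suggested; the attach line is the fallback with the new drive, assuming it is partitioned so that ada0p3 exists:

```sh
#!/bin/sh
# Dry run: builds and prints the commands, does not execute them.
pool=zdb1
good=ada1p3    # surviving mirror member named in the thread
new=ada0p3     # assumption: same partition name on the re-seated or new drive

online_plan=$(printf '%s\n' "zpool online ${pool} ${new}" "zpool scrub ${pool}")
attach_plan="zpool attach ${pool} ${good} ${new}"

printf 'After re-seating:\n%s\n\nWith the replacement drive:\n%s\n' \
    "$online_plan" "$attach_plan"
```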

--
Robert