Re: Replacing a REMOVED drive in DEGRADED zpool

From: Robert <robert_at_webtent.org>
Date: Thu, 21 Aug 2025 13:46:31 UTC
On 8/21/2025 2:14 AM, David Christensen wrote:
> On 8/20/25 17:55, Robert wrote:
>
> I am used to seeing non-zero numbers in the READ, WRITE, and/or CKSUM 
> columns for a bad disk.  Did you do something to reset the numbers?

I know what you mean. I have seen REMOVED with SSDs on IBM x3550 
servers using mdadm as well; maybe the SATA connection hiccups? I 
haven't done a thing, no reset or clear. So I know I could perhaps 
reuse the same drive, but I'll need to SMART check it after pulling it 
to be sure.
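
For the record, once it's pulled and sitting in a test box, I figure 
something along these lines will tell me whether it's worth keeping 
(assuming smartmontools is installed; ada0 is just a guess at whatever 
device name it shows up as):

  smartctl -t long /dev/ada0       # kick off a long self-test
  smartctl -l selftest /dev/ada0   # check the result once it finishes
  smartctl -a /dev/ada0            # full attributes and error log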

>> The disk entered the REMOVED state at 6am this morning, a little over 
>> 14 hours ago 
>
>
> `zpool status`, above, said "One or more devices has been removed by 
> the administrator".  Did a human remove the disk or did ZFS? If a 
> human, what command did they use?

Nope, nobody. This server is in a data center and I'm quite sure no 
one touched it.
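
I will dig through the kernel messages from around 6am before I pull 
it, though, to see what actually kicked it out. Something like:

  dmesg | grep ada0
  grep ada0 /var/log/messages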
>> and I plan to replace on Friday night to give myself some time in 
>> case a restore needs to happen. Perhaps I should bump the local 
>> snapshot storage up to 168 hours (1 week) as well at this point or 
>> hold what is there, can I hold all snapshots with one command? Here 
>> is the disk info for the 3 drives remaining in the zpool ...
>>
>> root@db1:~ # camcontrol devlist
>> <WDC WD1500ADFD-00NLR5 21.07QR5>   at scbus1 target 0 lun 0 (ada1,pass1)
>> <WDC WD1500HLFS-01G6U3 04.04V05>   at scbus2 target 0 lun 0 (ada2,pass2)
>> <WDC WD1500ADFD-00NLR5 21.07QR5>   at scbus3 target 0 lun 0 (ada3,pass3)
>
>
> ZFS RAID10 with Raptors and VelociRaptor -- a blast from the past!  :-)
>
>
> The problem with vintage systems is "do not throw good money after 
> bad".  If you already have a spare Raptor or VelociRaptor and all four 
> disks test and report good with smartctl(8), then perhaps replacing 
> the failed disk with another disk is a good idea. Otherwise, I would 
> consider other options (a pair of SSD's).

Lol, yep, I believe I have a matching drive sitting in wait but will 
confirm when I get to the data center later today. It's an old 
Supermicro that has been like a rock. I don't throw things away unless 
they fail or lose support somehow; I rely on redundant hardware and/or 
services instead. I've had servers reach 15 years, but that is almost 
unachievable these days with technology advancing faster and faster. 
The redundant db server for this one does use SSD drives.
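
On my own question above about holding every local snapshot with one 
command: I don't believe there is a single "hold everything in the 
pool" switch, but piping the snapshot list through xargs should amount 
to the same thing, if I'm not mistaken (the tag name "keep" is 
arbitrary):

  zfs list -H -t snapshot -o name -r zdb1 | xargs -n1 zfs hold keep
  # and later, to let them go again:
  zfs list -H -t snapshot -o name -r zdb1 | xargs -n1 zfs release keep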

>> root@db1:~ # gpart show ada1
>> =>       40  293046688  ada1  GPT  (140G)
>>           40       1024     1  freebsd-boot  (512K)
>>         1064        984        - free -  (492K)
>>         2048   16777216     2  freebsd-swap  (8.0G)
>>     16779264  276267008     3  freebsd-zfs  (132G)
>>    293046272        456        - free -  (228K)
>
>
> What partition scheme is on the disks?
>
>
> I do not see an EFI system partition.  Is the motherboard firmware 
> BIOS/Legacy or UEFI?
Partition scheme? As I showed with gpart, these drives have a GUID 
Partition Table with the partitions as shown. I believe this server is 
legacy and does not support UEFI.
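
If I remember right, this should confirm it either way:

  sysctl machdep.bootmethod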
>
>
> How is ada0p1 freebsd-boot configured into the system?  ZFS 
> stripe-of-mirrors?  UFS gmirror/gstripe RAID10?
>
>
> How is ada0p2 freebsd-swap configured into the system?  One of four 
> swap devices?

Yes, I am concerned about the boot. I guess I need to run `gpart 
bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada?` on the new drive 
after I replace?
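
Spelled out, and assuming the replacement comes up as ada0 again with 
the freebsd-boot partition at index 1 like the other drives, I'm 
picturing:

  gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0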

>> All the drive report identical layouts as ada1. I've used camcontrol 
>> with identify to get all the serial numbers of these drives, so I 
>> plan to shut the server down, pull the bad drive and insert the 
>> replacement, boot up and replace. Would these be the steps I need to 
>> take assuming the replacement drive shows up as the same ada0 device?
>>
>> 1. Run `zpool offline zdb1 ada0p3`
>
>
> I would use zpool-detach(8) to remove the failed disk from the pool.

Yes, I was half expecting an error when I went to offline it; I will 
try to detach instead.
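
So, before shutting down to swap the drive, something like this on the 
failed member, if I understand you correctly:

  zpool detach zdb1 ada0p3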

> You will need to disconnect ada0p1 freebsd-boot and  ada0p2 
> freebsd-swap according to how they are configured into your system.

This is where I'm not sure. I used the default FreeBSD installation 
with ZFS and selected the drives for redundancy, I believe. I had 
never done this before this server; I had always used UFS with 
gmirror. But after seeing it as the FreeBSD default for a long time 
now, I decided to use it. I'm willing to take more risk when I have 
redundant services, as in this case.
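
Before I touch anything I will check how the installer actually wired 
up swap, which I assume is one plain swap device per drive rather than 
a mirror, with something like:

  swapinfo                  # which swap devices are active now
  grep -i swap /etc/fstab   # how they are configured at boot
  gmirror status            # in case the installer mirrored swap after all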

> Cloning ada1's GPT's (primary and secondary) to ada0 will result in 
> duplicate identifiers on two disks -- UUID's, labels, etc..  Two disks 
> with matching identifiers in the same computer is asking for trouble.  
> I would not do that.  If anything, clone the failed disk GPT's to the 
> replacement disk GPT's -- but only if the failed disk GPT's are good.
>
>
> If the failed disk is still mostly operational with bad blocks all 
> within the middle data portion of ada0p3 (NOT in metadata), cloning 
> the failed disk to the replacement disk could save effort.  
> ddrescue(1) may be required to get past bad blocks.
>
>
> Otherwise, I would zero the replacement disk and build it manually.

Ok, I have a matching drive, so I don't plan on reusing the REMOVED 
drive. I usually take the REMOVED one and examine it to see whether it 
can be re-inserted safely if needed. I was hoping gpart backup/restore 
would be the equivalent of sgdisk on Linux, which I've used many times 
to duplicate a disk for a replacement in software RAID.
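
For the record, the way I was picturing gpart backup/restore being 
used here is roughly:

  gpart backup ada1 > /root/ada1.gpt       # save the layout of a good drive
  gpart restore -F ada0 < /root/ada1.gpt   # recreate it on the replacement

though I take your point about duplicate identifiers and will check 
what it actually copies before relying on it.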

> I would use zpool-attach(8) to add the replacement ada0p3 as a mirror 
> of ada1p3.

Attach in place of zpool-replace?
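
If I follow, that would be something like

  zpool attach zdb1 ada1p3 ada0p3

attaching the new ada0p3 to the surviving half of that mirror, rather 
than the

  zpool replace zdb1 ada0p3

I had been assuming?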

> You will need to build and connect the replacement ada0p1 freebsd-boot 
> and replacement ada0p2 freebsd-swap according to how they are to be 
> configured into your system.

Again, I used the default FreeBSD ZFS install. But this is where I 
believe I need to tweak my recovery if gpart backup/restore does not 
prepare the disk as expected.
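
If the backup/restore route doesn't pan out, my fallback would be to 
build the replacement by hand to match the gpart show layout above, 
something along these lines (sizes taken from the existing drives, so 
treat it as a sketch):

  gpart create -s gpt ada0
  gpart add -t freebsd-boot -s 512k ada0        # index 1, boot code
  gpart add -t freebsd-swap -a 1m -s 8g ada0    # index 2, swap
  gpart add -t freebsd-zfs -a 1m ada0           # index 3, rest of the disk
  gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0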

> Finally, ZFS, ZFS stripe-of-mirrors, root-on-ZFS, and gmirror/gstripe 
> RAID10 are all non-trivial. Replacing such a disk correctly is going 
> to require a lot of knowledge.  If you like learning adventures, go 
> for it.
>
>
> But if you want 24x7 operations, I do better with 
> backup/wipe/install/restore.  It is simpler, I can estimate how long 
> it will take, I can roll it back, and I have confidence in the 
> results.  If you go this route, I would put FreeBSD on UFS on a single 
> small SSD and put the data on ZFS with redundant disks. Backup the OS 
> disk with rsync(1) and take images regularly. Restoring an OS disk 
> from an image is the fastest way to recover from a OS disk disaster.

This is my first time recovering a root zpool. I've used ZFS that way 
for many years, as you describe, with UFS and gmirror on two drives 
for the OS. I would like to experiment with restoring from a snapshot 
at some point, if someone could outline how handling a root zpool with 
zfs send/receive differs from what I've done with ZFS for data only. 
But for this case, I'd like to replace the drive as designed and get 
the server back in operation.
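
For what it's worth, my rough guess for the root zpool case, which I 
would love to have corrected, is that it is the usual recursive 
send/receive plus the boot pieces, something like this (dataset names 
assumed to follow the default install layout):

  zfs snapshot -r zdb1@migrate
  zfs send -R zdb1@migrate | zfs receive -u -F newpool
  zpool set bootfs=newpool/ROOT/default newpool
  # plus gpart bootcode on whatever disk(s) the new pool lives on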

Thanks for all the pointers!