Re: Replacing a REMOVED drive in DEGRADED zpool
Date: Thu, 21 Aug 2025 19:04:56 UTC
On 8/21/25 06:46, Robert wrote:
> On 8/21/2025 2:14 AM, David Christensen wrote:
>> On 8/20/25 17:55, Robert wrote:
>>
>> I am used to seeing non-zero numbers in the READ, WRITE, and/or CKSUM
>> columns for a bad disk. Did you do something to reset the numbers?
>
> I know what you mean, I have seen REMOVED using SSD on an IBM x3550
> servers using mdadm as well, does the SATA connector hiccup?

The wrong SATA HBA, cable, backplane, rack, etc., and/or a poor
connection anywhere can wreak havoc. A few years ago, I bought all new
SATA III cables and SATA III mobile racks, and replaced the previous
mixture of SATA I, II, III stuff in my various computers. Disk
reliability improved dramatically.

Re-seating all of the SATA cables, re-seating all of the drive power
cables, and re-seating the HBA, followed by `zpool online zdb1 ada0p3`
and `zpool scrub zdb1` could fix the problem.

> Partition scheme? As I showed with gpart, these drives have a GUID
> Partition Table with partitions as shown. I believe this server is
> legacy and does not support UEFI.

Okay -- BIOS/Legacy and GPT. Those are key parameters (among others)
that the FreeBSD installer detects and uses to choose what actions to
take. The goal is to reproduce what the installer did.

RTFM: gpart(8) has some good information about bootloader stages and
what pieces are needed for the various combinations of BIOS/UEFI,
MBR/GPT, UFS/ZFS/gmirror/gstripe, etc.

Much of the FreeBSD installer is a suite of shell scripts. They are
very well written and reasonably easy to read. Crawling the installer
code is another possible source of information.

> Yes, I am concerned about the boot,

I tried a manual install of FreeBSD not too long ago by following Lucas
([1] and [2]?). Trying to remember those, looking at your previous
`gpart show ada1` console session, knowing the computer is BIOS/Legacy,
and knowing the disks are GPT, I now believe ada[0-3]p1 are bootloader
stages that run after the MBR bootloader stage.
They should all be the same; gmirror/gstripe should not be involved.

Please run and post the following commands to check if ada[0-3]p1
bootloader stages are the same:

# cmp /dev/ada1p1 /dev/ada0p1
# cmp /dev/ada1p1 /dev/ada2p1
# cmp /dev/ada1p1 /dev/ada3p1

Please run and post the following commands to check if ada[0-3] MBRs
are the same:

# cmp <(dd if=/dev/ada1 count=1 status=none) <(dd if=/dev/ada0 count=1 status=none)
# cmp <(dd if=/dev/ada1 count=1 status=none) <(dd if=/dev/ada2 count=1 status=none)
# cmp <(dd if=/dev/ada1 count=1 status=none) <(dd if=/dev/ada3 count=1 status=none)

Please run and post the following commands to confirm that gmirror and
gstripe are not in use. Do not bother to load if prompted:

# gmirror status
# gstripe status

Please run and post the following command to confirm swap:

# swapinfo

Please post (redact as necessary) /boot/loader.conf,
/boot/loader.conf.d/*, /etc/rc.conf, and /etc/fstab in case there is
anything else we need to consider.

> I guess I need to `gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot
> -i 1 ada?` the new drive after I replace?

>> You will need to disconnect ada0p1 freebsd-boot and ada0p2
>> freebsd-swap according to how they are configured into your system.
>
> This is where I'm not sure. I used the default FreeBSD installation with
> ZFS and selected the drives for redundancy I believe. I never did this
> before this server and always used UFS with gmirror before. But after
> seeing the default in FreeBSD for a long time now, I decided to use. I
> get more risky when I have redundant services as in this case.

Running and posting the commands above should help determine what to do
with ada0p1 and ada0p2.

> Ok, I have a matching drive, so I don't plan on reusing the REMOVED
> drive. I usually take the REMOVED one and examine it to see if it can be
> re-inserted safely as needed.
> I was hoping the gpart backup/restore
> would be the equivalent of sgdisk in Linux that I've used many times to
> duplicate a disk used for replacement in software RAID.

I also use Linux, and FreeBSD gpart(8) backup and restore surprised me
in a bad way. Please see the thread branch starting here:

https://lists.freebsd.org/archives/freebsd-questions/2025-August/006883.html

>> I would use zpool-attach(8) to add the replacement ada0p3 as a mirror
>> of ada1p3.
>
> Attach in place of zpool-replace?

AIUI from zpool(8), if you detach one of two drives in a mirror, the
detached drive is forgotten, the mirror goes away, and the remaining
disk becomes a plain single-disk vdev (a stripe of one) at the top
level. When you install a new drive and want to turn that single disk
back into a mirror, use zpool(8) attach.

> Again, I used the default FreeBSD ZFS install. But this is where I
> believe I need to tweak my recovery if gpart backup/restore does not
> prepare the disk as expected.

Yes.

> Thanks for all the pointers!

YW. There is still more to figure out, but you are getting closer.

David

[1] https://mwl.io/nonfiction/os#af3e
[2] https://mwl.io/nonfiction/os#fmse
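
P.S. The `cmp <(dd ...) <(dd ...)` checks earlier in this message rely on process substitution, which bash and zsh provide but FreeBSD's /bin/sh and csh/tcsh do not. Below is a minimal sketch of the same comparisons in plain sh, in case that is the shell at hand. The function names `same_as_ref` and `same_first_sector` are hypothetical helpers, not existing tools, and the 512-byte read assumes the disks present 512-byte logical sectors.

```shell
# Hypothetical helper (not an existing tool): compare each candidate
# against a reference with cmp(1). Works on partitions or plain files.
# Usage: same_as_ref REFERENCE CANDIDATE...
same_as_ref() {
    ref=$1
    shift
    rc=0
    for dev in "$@"; do
        if cmp -s "$ref" "$dev"; then
            echo "same: $dev"
        else
            # cmp -s also fails when a device cannot be read at all.
            echo "DIFFERS (or unreadable): $dev"
            rc=1
        fi
    done
    return $rc
}

# Hypothetical helper: compare only the first 512 bytes (sector 0, the
# protective MBR on a GPT disk) by copying it to temporary files, so no
# process substitution is needed. Assumes 512-byte logical sectors.
# Usage: same_first_sector REFERENCE CANDIDATE...
same_first_sector() {
    ref=$1
    shift
    tmp=$(mktemp -d) || return 2
    dd if="$ref" of="$tmp/ref" bs=512 count=1 2>/dev/null
    rc=0
    for dev in "$@"; do
        dd if="$dev" of="$tmp/cand" bs=512 count=1 2>/dev/null
        if cmp -s "$tmp/ref" "$tmp/cand"; then
            echo "same: $dev"
        else
            echo "DIFFERS (or unreadable): $dev"
            rc=1
        fi
    done
    rm -rf "$tmp"
    return $rc
}
```

Run as root, something like `same_as_ref /dev/ada1p1 /dev/ada0p1 /dev/ada2p1 /dev/ada3p1` and `same_first_sector /dev/ada1 /dev/ada0 /dev/ada2 /dev/ada3` should give one line per device. A drive that is currently REMOVED will most likely show up as unreadable rather than as a genuine mismatch.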