Date: Thu, 21 Aug 2025 22:11:30 -0700
From: David Christensen
To: questions@freebsd.org
Subject: Re: Replacing a REMOVED drive in DEGRADED zpool
List-Id: User questions
List-Archive: https://lists.freebsd.org/archives/freebsd-questions
References: <86bjo9kvsv.fsf@ltc.des.dev> <86v7mgjy4c.fsf@ltc.des.dev> <5a74d92e-e6cc-4870-80fb-f2f1184b4d3e@webtent.org>
In-Reply-To: <5a74d92e-e6cc-4870-80fb-f2f1184b4d3e@webtent.org>
On 8/21/25 21:02, Robert wrote:
> On 8/21/2025 3:03 PM, Dag-Erling Smørgrav wrote:
>> You should take a look in /var/backups, you may find a backup of
>> the partition table from the failed drive.  Assuming you remove the
>> failed drive first, you can safely `gpart restore -l` this backup onto
>> the replacement drive, which will recreate the labels (but not UUIDs).
>
> Great, had no idea, yes, I see the gpartada0.backup in /var/backups...
>
> root@db1:~ # cat /var/backups/gpart.ada0.bak <<-- REMOVED disk
> GPT 128
> 1   freebsd-boot        40      1024 gptboot0
> 2   freebsd-swap      2048  16777216 swap0
> 3    freebsd-zfs  16779264 276267008 zfs0
> root@db1:~ # cat /var/backups/gpart.ada1.bak
> GPT 128
> 1   freebsd-boot        40      1024 gptboot1
> 2   freebsd-swap      2048  16777216 swap1
> 3    freebsd-zfs  16779264 276267008 zfs1
> root@db1:~ # cat /var/backups/gpart.ada2.bak
> GPT 128
> 1   freebsd-boot        40      1024 gptboot2
> 2   freebsd-swap      2048  16777216 swap2
> 3    freebsd-zfs  16779264 276267008 zfs2
> root@db1:~ # cat /var/backups/gpart.ada3.bak
> GPT 128
> 1   freebsd-boot        40      1024 gptboot3
> 2   freebsd-swap      2048  16777216 swap3
> 3    freebsd-zfs  16779264 276267008 zfs3
> root@db1:~ # cat /var/backups/gpart.ada4.bak
>

Good.  So long as nothing uses GUID/UUID, gpart(8) restore with labels
should work.
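For reference, a minimal sketch of the restore step Dag-Erling describes.
It assumes the replacement drive shows up as ada0 and that the backup files
are named as in the listing above; restore_cmd is a hypothetical helper
name.  It only prints the command so it can be reviewed before being run
as root:

```shell
#!/bin/sh
# Print (do not run) the gpart(8) restore command for one disk.
# Usage: restore_cmd DISK [BACKUP]
# BACKUP defaults to /var/backups/gpart.DISK.bak (naming assumed from the
# listing in this thread).
restore_cmd() {
    disk=$1
    backup=${2:-/var/backups/gpart.$disk.bak}
    # -l restores the partition labels recorded in the backup;
    # gpart restore reads the backup from standard input.
    printf 'gpart restore -l %s < %s\n' "$disk" "$backup"
}

# Example: restore_cmd ada0
```

Running "restore_cmd ada0" prints
"gpart restore -l ada0 < /var/backups/gpart.ada0.bak".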
This is my server system disk (BIOS, MBR):

2025-08-21 21:13:19 toor@f5 ~ # gpart show ada0
=>       40  117231328  ada0  GPT  (56G)
         40       1024     1  freebsd-boot  (512K)
       1064   29359104     2  freebsd-ufs  (14G)
   29360168    1564672     3  freebsd-swap  (764M)
   30924840   86306528        - free -  (41G)

I have a backup of the freebsd boot partition:

2025-08-21 21:55:05 toor@f5 ~ # ll /var/backups/boot.ada0p1.bak
-rw-r--r--  1 root  wheel  524288 2024/03/04 03:01:00 /var/backups/boot.ada0p1.bak

And the backup still matches ada0p1:

2025-08-21 21:13:44 toor@f5 ~ # cmp /dev/ada0p1 /var/backups/boot.ada0p1.bak

2025-08-21 21:14:00 toor@f5 ~ # echo $?
0

The last piece of the puzzle is the MBR.  I see some possibilities in /boot:

2025-08-21 21:20:36 toor@f5 ~ # ll -S /boot | grep ' 512 ' | grep -v drwx
-r--r--r--  1 root  wheel  512 2025/05/24 14:51:34 boot0
-r--r--r--  1 root  wheel  512 2025/05/24 14:51:34 boot0sio
-r--r--r--  1 root  wheel  512 2023/04/06 21:24:38 boot1
-r--r--r--  1 root  wheel  512 2023/04/06 21:24:38 mbr
-r--r--r--  1 root  wheel  512 2023/04/06 21:24:38 pmbr

Referring to Wikipedia "Master boot record" table "Structure of a
classical generic MBR":

https://en.wikipedia.org/wiki/Master_boot_record

The bootstrap code area is the first 446 bytes.  Look for a match:

2025-08-21 21:24:08 toor@f5 ~ # cmp -n 446 /dev/ada0 /boot/boot0
/dev/ada0 /boot/boot0 differ: char 12, line 1

2025-08-21 21:25:00 toor@f5 ~ # cmp -n 446 /dev/ada0 /boot/boot0sio
/dev/ada0 /boot/boot0sio differ: char 12, line 1

2025-08-21 21:25:05 toor@f5 ~ # cmp -n 446 /dev/ada0 /boot/boot1
/dev/ada0 /boot/boot1 differ: char 1, line 1

2025-08-21 21:25:08 toor@f5 ~ # cmp -n 446 /dev/ada0 /boot/mbr
/dev/ada0 /boot/mbr differ: char 12, line 1

2025-08-21 21:25:12 toor@f5 ~ # cmp -n 446 /dev/ada0 /boot/pmbr

cmp(1) printed nothing for /boot/pmbr, so the FreeBSD installer put
/boot/pmbr into the MBR of my system disk.
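The same search can be written as a loop.  This is only a sketch of the
manual cmp(1) runs above, not anything the installer does; match_bootcode
is a hypothetical name, and the disk argument can be a device such as
/dev/ada0 (readable as root) or an image file:

```shell
#!/bin/sh
# Compare the first 446 bytes (the MBR bootstrap code area) of a disk or
# disk image against every 512-byte regular file in a directory, and print
# the files whose first 446 bytes match.
# Usage: match_bootcode DISK [DIR]    (DIR defaults to /boot)
match_bootcode() {
    disk=$1
    dir=${2:-/boot}
    for f in "$dir"/*; do
        # Only regular files of exactly one sector are candidates.
        [ -f "$f" ] || continue
        [ $(wc -c < "$f") -eq 512 ] || continue
        # -s keeps cmp silent; -n 446 limits the comparison to the
        # bootstrap code area.
        if cmp -s -n 446 "$disk" "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Example: match_bootcode /dev/ada0
```

The partition entries and the boot signature (bytes 446 through 511) are
deliberately left out of the comparison, since those differ per disk.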
Checking the partition table entries and boot signature:

2025-08-21 21:28:19 toor@f5 ~ # cmp -i 446 -n 16 /dev/ada0 /boot/pmbr
/dev/ada0 /boot/pmbr differ: char 3, line 1

2025-08-21 21:28:50 toor@f5 ~ # cmp -i 462 -n 16 /dev/ada0 /boot/pmbr

2025-08-21 21:28:58 toor@f5 ~ # cmp -i 478 -n 16 /dev/ada0 /boot/pmbr

2025-08-21 21:29:09 toor@f5 ~ # cmp -i 494 -n 16 /dev/ada0 /boot/pmbr

2025-08-21 21:29:17 toor@f5 ~ # cmp -i 510 -n 2 /dev/ada0 /boot/pmbr

So, everything matches except partition entry number 1:

2025-08-21 21:31:33 toor@f5 ~ # dd if=/dev/ada0 count=1 status=none | hexdump -s 446 -n 16
000001be  00 00 02 00 ee ff ff ff  01 00 00 00 2f cf fc 06  |............/...|
000001ce

2025-08-21 21:32:27 toor@f5 ~ # dd if=/boot/pmbr count=1 status=none | hexdump -s 446 -n 16
000001be  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001ce

So, the installer must have populated the first partition entry.

Referring to the Wikipedia page table "Layout of one 16-byte partition
entry", decoding my MBR first partition entry:

Status or physical drive
    00 = inactive
CHS address of first absolute sector in partition
    cylinder = 0
    head = 0
    sector = 2
Partition type
    ee = GPT protective MBR
CHS address of last absolute sector in partition
    cylinder = 1023
    head = 255
    sector = 63
LBA of first absolute sector in the partition
    0x00000001 = sector 1
Number of sectors in partition
    0x06fccf2f = 117231407 sectors

Convert the number of sectors in partition field value to decimal:

2025-08-21 21:32:37 toor@f5 ~ # perl -e 'printf "%i\n", 0x06fccf2f'
117231407

This matches the disk size in sectors minus 1 (sector 0 holds the MBR
itself):

2025-08-21 21:54:57 toor@f5 ~ # diskinfo -v ada0 | grep 'mediasize in sectors'
	117231408   	# mediasize in sectors

Again, I would check whether the failed disk and the other disks all have
the same MBR.  If so, you could clone one of them into the MBR of the
replacement disk.


>>> Would recovering the disk be beneficial versus replace?
>>> As far as faster recovery, not needing to resilver or as much.  These
>>> are not big drives as you can see and RAID10 zpool.
>> You can try to use recoverdisk to copy undamaged portions of the failed
>> drive onto the replacement, but it's likely to take longer than
>> resilvering.
> Then I'll stick to the original plan but with attach instead of replace
> using `zpool attach ada0p3 ada0p3`.
>

I think you have a typo -- the replacement ada0p3 should attach to ada1p3
(and zpool attach wants the pool name first: pool, existing device, new
device).


David