Date: Tue, 2 Sep 2025 22:55:00 +0900
From: Tomoaki AOKI
To: Warner Losh
Cc: Poul-Henning Kamp, Graham Perrin, FreeBSD-CURRENT
Subject: Re: Using a recovery partition to repair a broken installation of FreeBSD
Message-Id: <20250902225500.70577e08c0584754e743bac9@dec.sakura.ne.jp>
Organization: Junchoon corps
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Mon, 1 Sep 2025 21:02:45 -0600
Warner Losh wrote:

> On Mon, Sep 1, 2025 at 5:42 AM Tomoaki AOKI wrote:
>
> > On Mon, 1 Sep 2025 03:15:50 -0600
> > Warner Losh wrote:
> >
> > > On Mon, Sep 1, 2025, 3:05 AM Poul-Henning Kamp wrote:
> > >
> > > > --------
> > > > Tomoaki AOKI writes:
> > > >
> > > > > > … it would be nice to have something like 'recovery partition', as
> > > > > > some OSes have. or at least some tiny fail-safe feature. having remote
> > > > > > machine in some distant datacenter, booting from a flashstick is always
> > > > > > a problem.
> > > >
> > > > I thought that is what /rescue is for ?
> > >
> > > That only works if your boot loader can read it... I've thought for a
> > > while now that maybe we should move that into a ram disk image that we fall
> > > back to if the boot loader can't read anything else...
> > >
> > > Warner
> >
> > Exactly. If the loader (or bootcode to kick the loader in the
> > partition/pool) can sanely read the partition/pool to boot from,
> > I think /rescue is enough and no need for rescue "partition / pool".
> >
> > But once the partition / pool to boot is broken (including lost
> > decryption key for encrypted partitions/drives from regular place),
> > something others are needed.
> >
> > And what can be chosen to boot from BIOS/UEFI firmware depends on
> > the implementation (some could restrict per-drive only, instead of
> > every entry in EFI boot manager table).
> >
> > If BIOS/firmware allow to choose "drive" to boot, rescue "drive"
> > is useful, if multiple physical drives are available.
> >
> > Yes, rescue mfsroot embedded into loader.efi would be a candidate, too,
> > if the size of ESP allows.
>
> Rescue is quite small. On the order of 8MB compressed. The trouble is that
> the kernel is like 12MB compressed, plus we'd need a few more modules.
> Still, we could likely get something under 25MB that's an MD image that we
> could boot into, but it would have to be single user. And It's been a while
> since I did that... Typically I just run /rescue/init or /rescue/sh, which
> isn't a full system and still uses the system's /etc. If we customized it
> per system, we could do better, since the kernel can be a bit smaller
> (compressed our kernels at work are 6MB), so under 20MB could be possible.
> We'd not need /boot/loader.efi in there.

Oh, much smaller than I expected!

Actually, with boot1.efi (either stock or patched), Root on ZFS users
can keep a rescue UFS partition on the same drive. This works because
boot1.efi looks for /boot/loader.efi in a ZFS pool first, then on UFS.
This priority is applied per drive: if neither is found, boot1.efi
moves on to the next drive, in the order the UEFI firmware recognized
them. (The first drive tried is the one boot1.efi itself was started
from.)

This is how smh@ implemented it when I asked for a fix to a UEFI boot
issue (at that time, loader.efi could not be started directly by the
UEFI firmware and needed boot1.efi). Warner may remember that, before
the fix, boot1.efi always searched for /boot/loader.efi in the order
the UEFI firmware recognized the drives. So even when started from a
USB memstick for rescue, boot1.efi "always" started the loader on the
first "internal" drive, which made rescue impossible.
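
For illustration only, the search order described above can be modelled
roughly as below. This is a toy sketch in Python, not the actual boot1.efi
code (which is C). The names Drive, Partition and find_loader are made up
for this example, and "the loader is readable" is reduced to "the path is
in a set".

```python
from dataclasses import dataclass, field

LOADER = "/boot/loader.efi"


@dataclass
class Partition:
    fstype: str                               # "zfs" or "ufs"; others are ignored
    files: set = field(default_factory=set)   # paths readable on this partition


@dataclass
class Drive:
    name: str
    partitions: list


def find_loader(drives, boot_drive):
    """Return (drive name, fstype) of the first loader found, or None.

    `drives` is assumed to be in the order the firmware recognized them;
    `boot_drive` is the name of the drive boot1.efi was started from.
    """
    # The drive boot1.efi was started from is tried first,
    # then the remaining drives in firmware order.
    ordered = [d for d in drives if d.name == boot_drive]
    ordered += [d for d in drives if d.name != boot_drive]

    for drive in ordered:
        # Per-drive priority: ZFS first, then UFS.
        for fstype in ("zfs", "ufs"):
            for part in drive.partitions:
                if part.fstype == fstype and LOADER in part.files:
                    return (drive.name, fstype)
    return None


if __name__ == "__main__":
    # Internal drive whose ZFS pool is unreadable, with a UFS rescue partition.
    broken = Drive("ada0", [Partition("zfs"), Partition("ufs", {LOADER})])
    print(find_loader([broken], "ada0"))
```

In this model, a drive whose ZFS pool no longer yields /boot/loader.efi
falls back to the UFS partition on the same drive, and a memstick that
boot1.efi was started from wins over an internal drive listed earlier by
the firmware.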
Fresh installations were fine with the old behaviour, as there was no
/boot/loader.efi on any of the internal drives.

> If we could hook into the arch specific traps that cause segv, etc, we
> could do a setjmp early and set 'safe mode' and restart. Though that may
> be trickier than I initially am thinking... maybe the best bet is to let
> uefi catch that failure and have the next bootable BootXXXX environment on
> the list specify a safe mode. More investigation might be needed.
>
> Warner

Yeah, and it could be (and in practice would be) implementation-specific.
It may be chaotic in the real world, and lots of quirks would be required.

-- 
Tomoaki AOKI