Re: Using a recovery partition to repair a broken installation of FreeBSD
- In reply to: Warner Losh : "Re: Using a recovery partition to repair a broken installation of FreeBSD"
Date: Wed, 03 Sep 2025 09:28:34 UTC
On Tue, 2 Sep 2025 13:25:59 -0600
Warner Losh <imp@bsdimp.com> wrote:

> On Tue, Sep 2, 2025 at 7:55 AM Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
> wrote:
>
> > On Mon, 1 Sep 2025 21:02:45 -0600
> > Warner Losh <imp@bsdimp.com> wrote:
> >
> > > On Mon, Sep 1, 2025 at 5:42 AM Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
> > > wrote:
> > >
> > > > On Mon, 1 Sep 2025 03:15:50 -0600
> > > > Warner Losh <imp@bsdimp.com> wrote:
> > > >
> > > > > On Mon, Sep 1, 2025, 3:05 AM Poul-Henning Kamp <phk@phk.freebsd.dk>
> > > > > wrote:
> > > > >
> > > > > > --------
> > > > > > Tomoaki AOKI writes:
> > > > > >
> > > > > > > … it would be nice to have something like 'recovery partition',
> > > > > > > as some OSes have. or at least some tiny fail-safe feature.
> > > > > > > having remote machine in some distant datacenter, booting from
> > > > > > > a flashstick is always a problem.
> > > > > >
> > > > > > I thought that is what /rescue is for ?
> > > > >
> > > > > That only works if your boot loader can read it... I've thought for
> > > > > a while now that maybe we should move that into a ram disk image
> > > > > that we fall back to if the boot loader can't read anything else...
> > > > >
> > > > > Warner
> > > >
> > > > Exactly. If the loader (or bootcode to kick the loader in the
> > > > partition/pool) can sanely read the partition/pool to boot from,
> > > > I think /rescue is enough and no need for rescue "partition / pool".
> > > >
> > > > But once the partition / pool to boot is broken (including lost
> > > > decryption key for encrypted partitions/drives from regular place),
> > > > something others are needed.
> > > >
> > > > And what can be chosen to boot from BIOS/UEFI firmware depends on
> > > > the implementation (some could restrict per-drive only, instead of
> > > > every entry in EFI boot manager table).
> > > >
> > > > If BIOS/firmware allow to choose "drive" to boot, rescue "drive"
> > > > is useful, if multiple physical drives are available.
> > > >
> > > > Yes, rescue mfsroot embedded into loader.efi would be a candidate,
> > > > too, if the size of ESP allows.
> > >
> > > Rescue is quite small. On the order of 8MB compressed. The trouble is
> > > that the kernel is like 12MB compressed, plus we'd need a few more
> > > modules. Still, we could likely get something under 25MB that's an MD
> > > image that we could boot into, but it would have to be single user.
> > > And it's been a while since I did that... Typically I just run
> > > /rescue/init or /rescue/sh, which isn't a full system and still uses
> > > the system's /etc. If we customized it per system, we could do better,
> > > since the kernel can be a bit smaller (compressed our kernels at work
> > > are 6MB), so under 20MB could be possible. We'd not need
> > > /boot/loader.efi in there.
> >
> > Oh, much smaller than I've expected!
> >
> > Actually, using boot1.efi (either stock or patched), users of Root on
> > ZFS can have rescue UFS partition on the same drive.
> > This is because it looks for /boot/loader.efi to kick from ZFS pool
> > first, then, UFS. This is per-drive priority and if both are NOT found,
> > boot1.efi looks for another drive with the order that UEFI firmware
> > recognized. (The first to try is the drive boot1.efi itself was kicked.)
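
To make sure we are talking about the same ordering, here is a toy model
of it in C (every name below is invented for illustration; the real
logic lives in stand/efi/boot1/ and differs in detail):

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Invented model of a drive as boot1.efi might see it. */
struct drive {
    const char *name;
    bool zfs_has_loader;    /* /boot/loader.efi readable via ZFS */
    bool ufs_has_loader;    /* /boot/loader.efi readable via UFS */
};

/* Per-drive priority: try the ZFS pool first, then UFS. */
static const char *
probe_one(const struct drive *d)
{
    if (d->zfs_has_loader)
        return ("ZFS");
    if (d->ufs_has_loader)
        return ("UFS");
    return (NULL);
}

/*
 * The drive boot1.efi itself was kicked from gets the first try; after
 * that, fall back to the others in firmware enumeration order.
 */
static const char *
find_loader(const struct drive *dr, size_t n, size_t self, size_t *which)
{
    const char *fs;
    size_t i;

    if ((fs = probe_one(&dr[self])) != NULL) {
        *which = self;
        return (fs);
    }
    for (i = 0; i < n; i++) {
        if (i != self && (fs = probe_one(&dr[i])) != NULL) {
            *which = i;
            return (fs);
        }
    }
    return (NULL);
}

int
main(void)
{
    /* Root-on-ZFS drive with a broken pool but a rescue UFS partition. */
    struct drive dr[] = {
        { "ada0", false, true },
        { "da0",  false, false },
    };
    size_t which;
    const char *fs = find_loader(dr, 2, 0, &which);

    if (fs != NULL)
        printf("kick %s:/boot/loader.efi on %s\n", fs, dr[which].name);
    else
        printf("no /boot/loader.efi found on any drive\n");
    return (0);
}
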
> > This is how smh@ implemented when I requested to fix boot issue
> > on UEFI boot (at the moment, loader.efi cannot be kicked directly
> > by UEFI firmware and needed boot1.efi).
>
> This isn't true, at least not generally. We load loader.efi in all new
> installations by default. I've fixed a number of issues around this from
> the past... We're not able to use it at netflix to boot off of ZFS, for
> example...

This is why I believe you're the best person to ask about the loader. ;-)

> > Maybe Warner would remember, before the fix, boot1.efi always looked
> > for /boot/loader.efi with the order UEFI firmware recognized drives,
> > thus, even if started from USB memstick for rescue, boot1.efi
> > "always" kicked the first "internal" drive and cannot rescue.
> > Yes, fresh installations was OK with it, as there's no
> > /boot/loader.efi in any of internal drives.
>
> Yea, I'm not remembering it...

It was in late January 2016:
https://lists.freebsd.org/pipermail/freebsd-current/2016-January/059387.html

> > > If we could hook into the arch specific traps that cause segv, etc,
> > > we could do a setjmp early and set 'safe mode' and restart. Though
> > > that may be trickier than I initially am thinking... maybe the best
> > > bet is to let uefi catch that failure and have the next bootable
> > > BootXXXX environment on the list specify a safe mode. More
> > > investigation might be needed.
> > >
> > > Warner
> >
> > Yeah, and it could be (and would actually be) implementation-specific.
> > Maybe chaotic in real world and lots of quirks would be required.
>
> I don't understand that part... It would be architecture specific, but
> why would it be implementation specific?
>
> Warner

Even for mandatory features, implementations that misunderstand the spec
can behave in implementation-specific ways, especially in the early days
of a standard, unfortunately. Do you remember the early PCI (not PCIe!)
incompatibility issues? And early USB, too, IIRC.

-- 
Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
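
P.S. On the setjmp idea quoted above: the shape I imagine is roughly the
following (purely a sketch; the trap hook here is imaginary, and a real
loader would have to be much more careful about which state is still
trustworthy after a fault):

#include <setjmp.h>
#include <stdio.h>

/* Checkpoint taken early, before anything likely to fault runs. */
static jmp_buf recover;
static int safe_mode = 0;

/*
 * Imaginary hook: a real loader would wire this into the arch-specific
 * segv/fault traps so a crash lands here instead of wedging the box.
 */
static void
fault_handler(void)
{
    longjmp(recover, 1);
}

/* Stand-in for the loader's normal startup path. */
static void
boot_normally(void)
{
    if (!safe_mode)
        fault_handler();    /* simulate a crash on the first try */
    printf("booted, safe_mode=%d\n", safe_mode);
}

int
main(void)
{
    if (setjmp(recover) != 0)
        safe_mode = 1;    /* we faulted: retry with risky bits off */
    boot_normally();
    return (0);
}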