From: Warner Losh
Date: Tue, 2 Sep 2025 13:25:59 -0600
Subject: Re: Using a recovery partition to repair a broken installation of FreeBSD
To: Tomoaki AOKI
Cc: Poul-Henning Kamp, Graham Perrin, FreeBSD-CURRENT
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Tue, Sep 2, 2025 at 7:55 AM Tomoaki AOKI <junchoon@dec.sakura.ne.jp> wrote:

> On Mon, 1 Sep 2025 21:02:45 -0600
> Warner Losh <imp@bsdimp.com> wrote:
>
> > On Mon, Sep 1, 2025 at 5:42 AM Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
> > wrote:
> >
> > > On Mon, 1 Sep 2025 03:15:50 -0600
> > > Warner Losh <imp@bsdimp.com> wrote:
> > >
> > > > On Mon, Sep 1, 2025, 3:05 AM Poul-Henning Kamp <phk@phk.freebsd.dk>
> > > > wrote:
> > > >
> > > > > --------
> > > > > Tomoaki AOKI writes:
> > > > >
> > > > > > > … it would be nice to have something like a 'recovery
> > > > > > > partition', as some OSes have. Or at least some tiny
> > > > > > > fail-safe feature. Having a remote machine in some distant
> > > > > > > datacenter, booting from a flashstick is always a problem.
> > > > >
> > > > > I thought that is what /rescue is for ?
> > > >
> > > > That only works if your boot loader can read it... I've thought for
> > > > a while now that maybe we should move that into a ram disk image
> > > > that we fall back to if the boot loader can't read anything else...
> > > >
> > > > Warner
> > >
> > > Exactly. If the loader (or bootcode to kick the loader in the
> > > partition/pool) can sanely read the partition/pool to boot from,
> > > I think /rescue is enough and there is no need for a rescue
> > > "partition / pool".
> > >
> > > But once the partition / pool to boot is broken (including a lost
> > > decryption key for encrypted partitions/drives from the regular
> > > place), something else is needed.
> > >
> > > And what can be chosen to boot from BIOS/UEFI firmware depends on
> > > the implementation (some could restrict per-drive only, instead of
> > > every entry in the EFI boot manager table).
> > >
> > > If the BIOS/firmware allows choosing the "drive" to boot, a rescue
> > > "drive" is useful, if multiple physical drives are available.
> > >
> > > Yes, a rescue mfsroot embedded into loader.efi would be a candidate,
> > > too, if the size of the ESP allows.
> >
> > Rescue is quite small. On the order of 8MB compressed. The trouble is
> > that the kernel is like 12MB compressed, plus we'd need a few more
> > modules. Still, we could likely get something under 25MB that's an MD
> > image that we could boot into, but it would have to be single user.
> > And it's been a while since I did that... Typically I just run
> > /rescue/init or /rescue/sh, which isn't a full system and still uses
> > the system's /etc. If we customized it per system, we could do better,
> > since the kernel can be a bit smaller (our compressed kernels at work
> > are 6MB), so under 20MB could be possible. We'd not need
> > /boot/loader.efi in there.
>
> Oh, much smaller than I've expected!
>
> Actually, using boot1.efi (either stock or patched), users of Root on
> ZFS can have a rescue UFS partition on the same drive.
> This is because it looks for /boot/loader.efi to kick from the ZFS pool
> first, then UFS. This is a per-drive priority, and if both are NOT found,
> boot1.efi looks at the other drives in the order that the UEFI firmware
> recognized them. (The first to try is the drive boot1.efi itself was
> kicked from.)
>
> This is how smh@ implemented it when I requested a fix for a boot issue
> on UEFI boot (at the moment, loader.efi cannot be kicked directly
> by UEFI firmware and needed boot1.efi).

This isn't true, at least not generally. We load loader.efi in all new
installations by default. I've fixed a number of issues around this in
the past... We're not able to use it at Netflix to boot off of ZFS, for
example...

> Maybe Warner would remember, before the fix, boot1.efi always looked for
> /boot/loader.efi in the order the UEFI firmware recognized drives,
> thus, even if started from a USB memstick for rescue, boot1.efi
> "always" kicked the first "internal" drive and could not rescue.
> Yes, fresh installations were OK with it, as there's no /boot/loader.efi
> on any of the internal drives.

Yea, I'm not remembering it...

> > If we could hook into the arch specific traps that cause segv, etc, we
> > could do a setjmp early and set 'safe mode' and restart. Though that
> > may be trickier than I initially am thinking... maybe the best bet is
> > to let uefi catch that failure and have the next bootable BootXXXX
> > environment on the list specify a safe mode. More investigation might
> > be needed.
> >
> > Warner
>
> Yeah, and it could be (and would actually be) implementation-specific.
> Maybe chaotic in real world and lots of quirks would be required.

I don't understand that part... It would be architecture specific, but
why would it be implementation specific?

Warner
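
The boot1.efi search order described in the quoted text (ZFS pool first, then UFS, on the drive boot1.efi was started from, then the remaining drives in firmware order) can be modelled roughly as below. This is only a toy Python sketch, not code from boot1.efi: `Drive`, `loader_on`, and `find_loader` are names made up for illustration, and the drive names are examples.

```python
from dataclasses import dataclass, field


@dataclass
class Drive:
    """Hypothetical stand-in for one disk as the firmware enumerates it."""
    name: str
    # Filesystem types on this drive that contain /boot/loader.efi.
    loader_on: set = field(default_factory=set)


def find_loader(drives, boot_drive):
    """Return (drive name, filesystem) of the first loader.efi found, or None.

    drives is in firmware enumeration order; boot_drive is the drive
    boot1.efi itself was started from, which is tried first.
    """
    ordered = [boot_drive] + [d for d in drives if d is not boot_drive]
    for drive in ordered:
        for fs in ("zfs", "ufs"):  # per drive: ZFS pool first, then UFS
            if fs in drive.loader_on:
                return drive.name, fs
    return None


internal = Drive("ada0", {"zfs", "ufs"})
usb = Drive("da0", {"ufs"})

print(find_loader([internal, usb], boot_drive=usb))       # ('da0', 'ufs')
print(find_loader([internal, usb], boot_drive=internal))  # ('ada0', 'zfs')

# If the ZFS pool on ada0 becomes unreadable, the rescue UFS partition on
# the same drive is picked up next.
internal.loader_on.discard("zfs")
print(find_loader([internal, usb], boot_drive=internal))  # ('ada0', 'ufs')
```

Trying the boot drive first is what lets a USB memstick win over an internal drive in this model, which is the behaviour the quoted text attributes to the fix.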


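
The fallback floated in the thread, letting UEFI move on to the next BootXXXX entry and having that entry specify a safe mode, could look roughly like this. It is a toy Python model of the firmware walking its boot order, not real firmware or loader code: `boot_with_fallback`, `try_boot`, and the `safe_mode=1` argument are hypothetical names chosen for illustration.

```python
def boot_with_fallback(boot_order, entries, try_boot):
    """Try each BootXXXX entry in order; return the first that boots, else None.

    boot_order: list of entry numbers, like the UEFI BootOrder variable.
    entries: dict mapping entry number -> (description, loader arguments).
    try_boot: hypothetical callable returning True if the entry booted.
    """
    for num in boot_order:
        description, args = entries[num]
        if try_boot(description, args):
            return f"Boot{num:04X}"
    return None


entries = {
    0x0001: ("FreeBSD", ""),
    0x0002: ("FreeBSD (safe mode)", "safe_mode=1"),  # made-up argument
}


# Simulate a normal boot that crashes: only the safe-mode entry succeeds.
def normal_boot_crashes(description, args):
    return "safe_mode=1" in args


print(boot_with_fallback([0x0001, 0x0002], entries, normal_boot_crashes))  # Boot0002
```

The point of the model is only that the safe-mode choice lives in the second entry's arguments, so the loader needs no crash handling of its own; whether real firmware reliably falls through after a crash is the open question raised in the thread.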