Re: nanobsd [was Re: Cross compiling user applications for armv7]

In reply to: Karl Denninger : "Re: nanobsd [was Re: Cross compiling user applications for armv7]"
From: Warner Losh <imp_at_bsdimp.com>
Date: Sun, 21 Sep 2025 21:49:56 UTC
On Sun, Sep 21, 2025 at 1:31 PM Karl Denninger <karl@denninger.net> wrote:

> On 9/21/2025 15:01, Warner Losh wrote:
>
>
>
> On Sun, Sep 21, 2025 at 7:40 AM Karl Denninger <karl@denninger.net> wrote:
>
>> On 9/20/2025 19:19, Sulev-Madis Silber wrote:
>>
>> On September 20, 2025 2:34:06 PM GMT+03:00, Karl Denninger <karl@denninger.net> wrote:
>> ...
>>
>> There are ways to have that also be two-root-partition, allowing "near-line" updates (update the other partition with the new OS code then reboot to activate it) provided you have a deterministic way to know which device the loader will boot from.  On EFI boot machines this can be problematic to obtain deterministically once the system is running.
>>
>>
>> there was some discussion somewhere about boot switching troubles
>>
>> unsure if even gpt helps here with its dual part tables
>>
>> in my case i replace env files in efi part to control boot switch. it's a bad hack. i use cp, sync, mv, sync, etc magic to make it more power fail resistant
>>
>> i wish there's some sane way to do that. maybe loader could have changes. so you don't need to muck around with currdev & rootdev and what else. perhaps boot by ufs label?
>>
>> in my case i finally settled on ufs label of rootfs-<unixtimestamp>. my approach writes full raw fs images which stay unmodified
>>
>> and what about zfs? i might also need to have double pool system for resilience
>>
>> i battled with this before efi too, 10y ago, in embedded. then it was fun if somewhere between uboot stages and fbsd loader / kernel, the boot order magically changes
>>
>> unsure if zfs be could be used here and is it enough. is it better in embedded? zfs also has benefits like copies=3 and compression. and ability to withstand power failures. unsure about which extent but on ufs i was once cursing on 0b file. but at least on zfs the bootfs is a metadata and that's much better than file on fs
>>
>> any opinions here?
>>
>> The basic problem is that the EFI loader has its own ideas about the
>> enumeration order of the devices on the machine, and you don't know what
>> they'll be.  If you want a "universal" media that will boot on legacy (no
>> EFI *possible*, which is the case for some such as pcEngines boxes) *and*
>> will work on EFI you have a quandary.
>>
>> I fixed the build issue in that such boxes typically can't boot GPT media
>> either, but that is fixable because you can still have a partition layout
>> that looks like this on MBR:
>>
>> 1. Partition "1"
>> 2. Partition "2"
>> 3. EFI (ignored for a non-EFI box)
>> 4. "Data partition" which is then sub-partitioned into "cfg" and "data"
>>
>> Looks like this on a USB stick when running:
>>
>> =>      63  60125121  da0  MBR  (29G)
>>         63  11257500    1  freebsd  [active]  (5.4G)
>>   11257563  11257500    2  freebsd  (5.4G)
>>   22515063     81920    3  efi  (40M)
>>   22596983    840517    4  freebsd  (410M)
>>   23437500  36687684       - free -  (17G)
>>
>> =>       0  11257500  da0s1  BSD  (5.4G)
>>          0        16         - free -  (8.0K)
>>         16  11257484      1  freebsd-ufs  (5.4G)
>>
>> =>       0  11257500  da0s2  BSD  (5.4G)
>>          0        16         - free -  (8.0K)
>>         16  11257484      1  freebsd-ufs  (5.4G)
>>
>> =>     0  840517  da0s4  BSD  (410M)
>>        0   62500      1  freebsd-ufs  (31M)
>>    62500  750000      4  freebsd-ufs  (366M)
>>   812500   28017         - free -  (14M)
>>
>> For an MBR/CSM boot (non-EFI) you simply set the "active" partition after
>> updating the other and that one is booted -- that works as it has always,
>> in that it tells the system what to boot.  The same is true if you use GPT
>> with the "bootme" flag.  The problem with EFI is that you need to know what
>> the EFI loader will call the disk so you can set "rootdev=...s1a" or "s2a"
>> since the EFI loader ignores the partition "active" marker, particularly if
>> you want "one build that works even on systems with no EFI or capacity to
>> boot a GPT disk."
>>
> There is a GPTBOOT.EFI that replicates the old gptboot protocol, but it's
> rather fragile so isn't enabled by default. It was written for the
> ping-pong setup where you can't rely on EFI env vars to drive the EFI boot
> manager, but instead mark the partitions. IMHO, though, it's really no
> different than setting a file in the ESP the loader reads, and the latter
> is more generic.  gptboot.efi, though, is a good place to look if you want
> to do ping-pong on GPT booted machines.
>
>> I have found no deterministic way to know what that will be (e.g. "disk0"
>> is the obvious, but that makes a presumption -- there is no other media
>> that could be enumerated.  What if there is?) once the box is booted and
>> running.  There is nothing visible in sysctl, for example, that tells me
>> deterministically where the loader got the running system's root from.
>>
> EFI variables tell you that.
>
> cfee69ad-a0de-47a9-93a8-f63106f8ae99-LoaderPath
> \EFI\FREEBSD\LOADER.EFI
>
> cfee69ad-a0de-47a9-93a8-f63106f8ae99-LoaderDev
>
> PciRoot(0x1)/Pci(0x1,0x1)/Pci(0x0,0x0)/NVMe(0x1,00-00-00-00-00-00-00-00)/HD(1,GPT,B05DF68B-625D-11EB-81AA-E0D55E1E73BD,0x28,0x96000)
>
> You can then take the HD(...) and match it to the efimedia that geom
> publishes to find this. This is the loaddev from the bootloader, but not
> the partition that we booted off of. The loader is responsible for setting
> vfs.root.mountfrom to tell the kernel where to get its root from. We do
> this by looking at /etc/fstab on the load device for the / entry since the
> loader doesn't know how to translate loader name space to FreeBSD name
> space. The EFI loader can get at the UEFI path, which we also export in
> various places like geom and devinfo.
>
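
A minimal sketch of that matching step, in Python. It pulls the HD(...) node out of a LoaderDev device path and compares it with an efimedia string. The function names are hypothetical, and it assumes geom reports efimedia in the same HD(...) text form as the variable above, which is worth checking on the target system.

```python
import re

# Matches the final hard-drive node of a UEFI device path, e.g.
# HD(1,GPT,<uuid>,0x28,0x96000). Case-insensitive because the GUID and
# scheme may be printed in either case.
_HD_RE = re.compile(
    r"HD\(\d+,(?:GPT|MBR),[^,()]+,0x[0-9a-f]+,0x[0-9a-f]+\)",
    re.IGNORECASE,
)


def hd_node(device_path):
    """Return the HD(...) node found in device_path, or None if absent."""
    m = _HD_RE.search(device_path)
    return m.group(0) if m else None


def same_partition(loader_dev, efimedia):
    """True if both strings name the same HD(...) node, ignoring case."""
    a = hd_node(loader_dev)
    b = hd_node(efimedia)
    return a is not None and b is not None and a.casefold() == b.casefold()
```

The LoaderDev string would come from the EFI variable shown above and the efimedia string from whatever geom reports for each partition. How those are collected is left out here.
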
> We could trivially add a -KernelDev and -KernelPath EFI variables to the
> mix.
>
> For ZFS, it's just the BE. And you get out of the ping/pong hell by making
> all that well managed, and off-line upgradeable.
>
>> If you're willing to build EFI-only and ZFS, for example, then you could
>> use the "bootfs" pool property for this and that should work as expected
>> (beadm does this).  But on small-RAM systems ZFS is ill-advised and a lot
>> of "embedded" applications are small-RAM.....
>>
> Does FreeBSD even run on a system with less than 1GB?
>
> It does run perfectly well on 1GB machines, specifically Pi3s (which are
> aarch64)
>
Great!  I was asking about smaller machines. Although we can boot on
smaller machines, doing anything at all taxing is difficult.

> ---<<BOOT>>---
> WARNING: Cannot find freebsd,dts-version property, cannot check DTB
> compliance
> Copyright (c) 1992-2023 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>         The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 14.3-STABLE #0 stable/14-n271913-1821af77efef-dirty: Fri Jul 11
> 08:45:25 EDT 2025
>
> karl@NewFS.denninger.net:/work/OBJ/ARM64-14-STABLE/obj/usr/src.14-STABLE/arm64.aarch64/sys/GENERIC
> arm64
> FreeBSD clang version 19.1.7 (https://github.com/llvm/llvm-project.git
> llvmorg-19.1.7-0-gcd708029e0b2)
> VT(efifb): resolution 656x416
> module scmi already present!
> real memory  = 994041856 (947 MB)
> avail memory = 945131520 (901 MB)
> Starting CPU 1 (1)
> Starting CPU 2 (2)
> Starting CPU 3 (3)
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>
> ZFS is a material overhead on a machine of this configuration particularly
> considering that it is a "nanobsd" environment.  I could build a zfs root
> filesystem, figure out how to send/receive that into the box and then set
> where it boots from on the pool as a means of "ping-pong" updating but I
> haven't tried loading zfs on these little things; my understanding is that
> on <4GB machines that is unwise and a microSD card is a pretty
> low-performance thing as well, in addition to having a tendency to get
> quite-unhappy if written to a great deal in non-aligned small block
> transactions. (I've collected quite a few dead cards this way over the last
> 10ish years.)
>
It's my belief that the I/O scheduler in ZFS schedules writes so that
smaller block transactions aren't a problem. And log structured
filesystems are nicer on flash. And the RPI3's SD support is pretty good in
hardware on all but the crappiest cards. <4GB does often require tuning to
use ZFS, though. I mentioned zfs in passing, though, since I think it's a
more robust solution when one can use it. In addition, one need not do zfs
send/receive to do the updates. "tar xvf" is perfectly fine since we don't
have to deploy a new filesystem like we used to with the ping-pong
partitions.


>> It would be nice if the EFI loader passed to the kernel where it loaded
>> from (e.g. its idea of "rootdev" at the time it ran) which the kernel could
>> then stash that in a sysctl-visible place.  That doesn't prevent someone
>> from screwing it up by plugging in some other device to the box (thus
>> potentially changing the EFI BIOS enumeration order) but so long as the
>> physical configuration doesn't change that should be good enough.
>>
>
> I'm not sure how that helps. It already sets vfs.root.mountfrom which you
> can get to via the kenv program. But it isn't the loader's notion of
> diskXXX, which, honestly, in an EFI world can be fraught.
>
> But in order for it to work you need to know the diskXXX it loaded from;
> that is, when you ping-pong you need to set, in the EFI partition, a
> loader.env file with:
>
> "rootdev=disk0s1a"
>
> (or s2a, or whatever) so the next time it boots the loader grabs off the
> right partition.
>
With kenv currdev, on a UFS system, you can get the currently booted system
and do the pattern matching to go from disk0s1a to disk0s2a and vice versa
today.
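
As a rough sketch of that pattern matching, in Python: it assumes the loader names look like "disk0s1a" (optionally with a trailing colon) and that the two root slices are s1 and s2, as in the MBR layout quoted earlier. The function names are hypothetical, and the loader.env line follows the "rootdev=disk0s1a" form quoted above.

```python
import re

# Assumption: UFS roots appear in the loader namespace as diskNsXp, where
# X is slice 1 or 2 and p is a BSD label letter. A trailing ":" is allowed.
_ROOT_RE = re.compile(r"^(disk\d+)s([12])([a-h]):?$")


def other_root(currdev):
    """Map disk0s1a -> disk0s2a and disk0s2a -> disk0s1a."""
    m = _ROOT_RE.match(currdev.strip())
    if m is None:
        # Anything else (ZFS, GPT names, ...) is out of scope for this sketch.
        raise ValueError("unrecognized currdev: %r" % currdev)
    disk, slice_no, part = m.groups()
    other = "2" if slice_no == "1" else "1"
    return "%ss%s%s" % (disk, other, part)


def loader_env_line(currdev):
    """Build the rootdev line that points at the inactive root slice."""
    return "rootdev=%s\n" % other_root(currdev)
```

The input would be the output of "kenv currdev" on the running system. Writing the resulting line into loader.env on the ESP in a power-fail-safe way is a separate problem and is not handled here.
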

> If it's ZFS then yes it can be constructed to work but if it's not then it
> doesn't.  If the EFI loader, in the *absence* of a "rootdev" entry in
> loader.env (which obviously should take precedence if set there) was to
> look for the "active" flag for MBR partitions (or "bootme" for GPT
> partitions) then it would work for UFS as well, but the EFI loader (unless
> something has recently changed, and I don't think it has) currently ignores
> that and boots the first partition it finds that appears to be bootable --
> which makes ping-pong not work unless you override it in loader.env within
> the EFI partition.
>
gptboot.efi has been around a few years and does exactly this, though it's
geared as a gptboot replacement so it will ping/pong between partitions
based on the FreeBSD-specific bootme and bootonce parameters being applied
to the GPT table. We use this at work to ping-pong between two different
installer systems at our system integrator that builds the OCAs. It's a
hack, and it's better to use the "uefi_rootdev" variable to specify exactly
what to boot using a UEFI device path (in that case, you don't need to know
the loader's namespace). We didn't do this for reasons that are specific to
our environment...

Warner