Fwd: Serious ZFS Bootcode Problem (GPT NON-UEFI)

Karl Denninger karl at denninger.net
Sun Feb 10 17:37:17 UTC 2019


On 2/10/2019 09:28, Allan Jude wrote:
> Are you sure it is non-UEFI? As the instructions you followed,
> overwriting da0p1 with gptzfsboot, will make quite a mess if that
> happens to be the EFI system partition, rather than the freebsd-boot
> partition.

Absolutely certain.  The system boards in this machine and in a bunch I
have in the field are SuperMicro X8DTL-IFs, which do not support UEFI at
all (no EFI-capable BIOS is available for them.)

They have encrypted root pools, but because gptzfsboot cannot read them
they also have a small unencrypted freebsd-zfs partition.  On an upgrade
I copy /boot/* to it after the kernel upgrade is done but before the
reboot.  That partition is not mounted during normal operation; its
only purpose is to load the kernel (and pre-boot .kos such as geli.)
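
For illustration only, the copy step described above could be sketched in
Python roughly as follows. This is not the actual update script; the
function name sync_boot and the /bootpool/boot mount point in the usage
note are hypothetical, and the sketch assumes the small boot partition
has already been imported and mounted.

```python
import shutil
from pathlib import Path


def sync_boot(src, dst):
    """Copy the whole tree under src into dst, overwriting existing files.

    Returns the sorted relative paths of every regular file under dst
    after the copy, so the caller can log or inspect what is there.
    """
    src, dst = Path(src), Path(dst)
    # dirs_exist_ok=True (Python 3.8+) merges into an existing destination
    # tree instead of failing because the directory is already present.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(str(p.relative_to(dst)) for p in dst.rglob("*") if p.is_file())
```

A call such as sync_boot("/boot", "/bootpool/boot") would copy everything
under /boot, not only the kernel directory. Files that exist only in the
destination are left alone, so stale files are not cleaned up by this
sketch.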

> Can you show 'gpart show' output?
[karl at NewFS ~]$ gpart show da1
=>       34  468862061  da1  GPT  (224G)
         34       2014       - free -  (1.0M)
       2048       1024    1  freebsd-boot  (512K)
       3072       1024       - free -  (512K)
       4096   20971520    2  freebsd-zfs  [bootme]  (10G)
   20975616  134217728    3  freebsd-swap  (64G)
  155193344  313667584    4  freebsd-zfs  (150G)
  468860928       1167       - free -  (584K)

Partition "2" is the one that should boot.
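
As a rough cross-check of the layout above, the listing can be parsed to
find the partition carrying the bootme attribute and to convert sector
counts to bytes. This is a minimal sketch: the 512-byte sector size is an
assumption (it matches the sizes gpart prints here), the function name
parse_gpart_show is made up for the example, and multiple attributes are
assumed to be comma-separated inside the brackets.

```python
import re

SECTOR_BYTES = 512  # assumption: 512-byte logical sectors

# start, size, partition index, type, optional [attributes]
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\S+)(?:\s+\[([^\]]*)\])?")

SAMPLE = """\
=>       34  468862061  da1  GPT  (224G)
         34       2014       - free -  (1.0M)
       2048       1024    1  freebsd-boot  (512K)
       3072       1024       - free -  (512K)
       4096   20971520    2  freebsd-zfs  [bootme]  (10G)
   20975616  134217728    3  freebsd-swap  (64G)
  155193344  313667584    4  freebsd-zfs  (150G)
  468860928       1167       - free -  (584K)
"""


def parse_gpart_show(text):
    """Return one dict per real partition; header and free rows are skipped."""
    parts = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # "=>" header line or "- free -" row
        start, size, index, ptype, attrs = m.groups()
        parts.append({
            "index": int(index),
            "type": ptype,
            "start": int(start),
            "bytes": int(size) * SECTOR_BYTES,
            "attrs": attrs.split(",") if attrs else [],
        })
    return parts
```

Filtering the result of parse_gpart_show(SAMPLE) for entries whose attrs
contain "bootme" selects partition 2, whose 20971520 sectors come to
exactly 10 GiB under the 512-byte assumption.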

There is also a da2 that has an identical layout (mirrored; the drives
are 240GB Intel 730 SSDs.)

> What is the actual boot error?

It says it can't load the kernel and gives me a prompt.  "lsdev" shows
all the disks; every one except the two (zfs mirror) that hold the
"bootme" partition fails to show up as a zfs pool (they're
geli-encrypted, so that's not unexpected.)  I don't believe the loader
ever actually gets loaded.

An attempt to use "ls" from the bootloader to look inside that "bootme"
partition fails; gptzfsboot cannot get it open.

My belief was that I screwed up and wrote the old 11.1 gptzfsboot to the
freebsd-boot partition originally -- but that is clearly not the case.

Late last night I took my "rescue media" (which is a "make memstick"
from the build of -STABLE), booted that on my sandbox machine, stuck two
disks in there and made a base system -- which booted.  Thus whatever is
going on here is not as simple as it first appeared, since that system
had the spacemap_v2 flag on and active once it came up.

This may be my own foot-shooting since I was able to make a bootable
system on my sandbox using the same media (a clone hardware-wise so also
no EFI) -- there may have been some part of the /boot hierarchy that
didn't get copied over, and if so that would explain it.

Update: Indeed that appears to be what it was -- a couple of the *other*
files in the boot partition didn't get copied from the -STABLE build
(although the entire kernel directory did).  I need to look at why that
happened, since the update process is my own (because of the
dual-partition requirement for booting on non-EFI), but that's not your
problem -- it's mine.

Sorry about this one; turns out to be something in my update scripts
that failed to move over some of the files to the non-encrypted /boot....
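
A check along the following lines, run before rebooting, would flag
files that never made it to the unencrypted boot partition. It is only a
sketch of the idea, not the update script itself; the function name
missing_or_different and the /bootpool/boot path in the usage note are
hypothetical.

```python
import filecmp
import os


def missing_or_different(src, dst):
    """List files under src that are absent from dst or differ in content.

    Returns sorted (relative_path, reason) tuples, where reason is
    "missing" or "differs". Files present only in dst are ignored.
    """
    problems = []
    for root, _dirs, files in os.walk(src):
        rel_root = os.path.relpath(root, src)
        for name in files:
            rel = os.path.normpath(os.path.join(rel_root, name))
            target = os.path.join(dst, rel)
            if not os.path.isfile(target):
                problems.append((rel, "missing"))
            # shallow=False forces a byte-for-byte comparison rather than
            # trusting size and modification time alone.
            elif not filecmp.cmp(os.path.join(root, name), target, shallow=False):
                problems.append((rel, "differs"))
    return sorted(problems)
```

An empty result from missing_or_different("/boot", "/bootpool/boot")
would mean every file under /boot has an identical copy on the boot
partition; anything else is worth looking at before the reboot.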

BTW am I correct that gptzfsboot did *not* get the ability to read
geli-encrypted pools in 12.0?  The UEFI loader does know how (which I'm
using on my laptop) but I was under the impression that for non-UEFI
systems you still needed the unencrypted boot partition from which to
load the kernel.

-- 
Karl Denninger
karl at denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/