Serious ZFS Bootcode Problem (GPT NON-UEFI)
list_freebsd at bluerosetech.com
Mon Feb 11 10:19:26 UTC 2019
On 02/09/2019 14:30, Karl Denninger wrote:
> FreeBSD 12.0-STABLE r343809
> After upgrading to this (without material incident) zfs was telling me
> that the pools could be upgraded (this machine was running 11.1, then 11.2.)
> I did so, /and put the new bootcode on with gpart bootcode -b /boot/pmbr
> -p /boot/gptzfsboot -i .... da... /on both of the candidate (mirrored
> ZFS boot disk) devices, in the correct partition.
> Then I rebooted to test and..... /could not find the zsboot pool
> containing the kernel./
> I booted the rescue image off my SD and checked -- the copy of
> gptzfsboot that I put on the boot partition is exactly identical to the
> one on the rescue image SD.
> Then, to be /absolutely sure /I wasn't going insane I grabbed the
> mini-memstick img for 12-RELEASE and tried THAT copy of gptzfsboot.
> /Nope; that won't boot either!/
> Fortunately I had a spare drive slot so I stuck in a piece of spinning
> rust, gpart'ed THAT with an old-style UFS boot filesystem, wrote
> bootcode on that, mounted the ZFS "zsboot" filesystem and copied it
> over. That boots fine (of course) and mounts the root pool, and off it
> I'm going to blow away the entire /usr/obj tree and rebuild the kernel
> to see if that gets me anything that's more-sane, but right now this
> looks pretty bad.
> BTW just to be absolutely sure I blew away the entire /usr/obj directory
> and rebuilt -- same size and checksum on the binary that I have
> installed, so.....
> Not sure what's going on here -- did something get moved?
I smashed my head against the wall for days with a very similar-sounding
problem: pure ZFS with a GELI root and separate /boot pool that would
not import the /boot pool at boot, resulting in the kernel not having
the keys to attach the GELI+ZFS root.
That configuration needs some extra bits in loader.conf so that
zpool.cache and the GELI keys get loaded for the kernel by the loader.
This loads the zpool.cache into the kernel so it imports everything
before /etc/rc.d/zfs can run (this covers the case where you have a ZFS
/boot that isn't imported after a reboot):
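
A sketch of the loader.conf lines this refers to, assuming the pool cache
is at the stock /boot/zfs/zpool.cache path (adjust if yours differs):

```
# Have the loader preload the pool cache file for the kernel.
# Assumption: the cache file is at the default location.
zpool_cache_load="YES"
zpool_cache_type="/boot/zfs/zpool.cache"
zpool_cache_name="/boot/zfs/zpool.cache"
```
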
Run geli init with -b so the providers are flagged for attachment at
boot (instead of by /etc/rc.d/geli), then add this for every GELI
provider you want the kernel to attach before starting the userland:
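
A sketch of the three lines per provider, following the pattern the
bsdinstall zfsboot script writes. FOO, devicename, and /path/to/keyfile
are placeholders, not literal values:

```
# One group of three lines per GELI provider.
# FOO              = any alphanumeric tag, unique per device
# devicename       = the provider, e.g. gpt/BAR or da0p3
# /path/to/keyfile = placeholder; a path the loader can read
geli_FOO_keyfile0_load="YES"
geli_FOO_keyfile0_type="devicename:geli_keyfile0"
geli_FOO_keyfile0_name="/path/to/keyfile"
```
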
FOO can be any alphanumeric string, and needs to be consistent for all
three lines and unique per device. The "devicename" is gpt/BAR for a
device with a GPT label of BAR. It can also be the unlabeled device
(e.g., da0p3), but using GPT labels is recommended because the keys then
stay matched to the right device even if the device numbering changes.
For example, my GELI+ZFS root is a mirror of partitions with nvmezfs0
and nvmezfs1 GPT labels, so I have in my loader.conf:
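
Along these lines. The key file paths shown are hypothetical
placeholders, so substitute the real location of each key file:

```
# Mirror member 1, GPT label nvmezfs0
# (the /boot/keys/... paths are assumed, for illustration only)
geli_nvmezfs0_keyfile0_load="YES"
geli_nvmezfs0_keyfile0_type="gpt/nvmezfs0:geli_keyfile0"
geli_nvmezfs0_keyfile0_name="/boot/keys/nvmezfs0.key"

# Mirror member 2, GPT label nvmezfs1
geli_nvmezfs1_keyfile0_load="YES"
geli_nvmezfs1_keyfile0_type="gpt/nvmezfs1:geli_keyfile0"
geli_nvmezfs1_keyfile0_name="/boot/keys/nvmezfs1.key"
```
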
If you use GPT labels, you can safely ignore the "GEOM_ELI: Found no key
files in loader.conf for DEVICE" messages where DEVICE is the unlabeled
device--the GELI module doesn't currently recognize that the unlabeled
and labeled devices are the same provider.
This doesn't appear to be documented in the Handbook or any man pages
that I could find. The zpool_cache_load trick is mentioned in a FreeBSD
wiki page, and the geli_* config is pulled from the zfsboot script
used by bsdinstall to install a pure-ZFS system with GELI root.
I'm not sure if this is exactly your problem, but maybe it helps?