What is OpenZFS doing during boot?

Ulrich Spörlein uqs at freebsd.org
Fri Apr 30 13:21:33 UTC 2021


Hi folks, this is a stable/13 question but I figured it's still close
enough to -CURRENT to count.

So I wanted to update my (remote) system with freebsd-update, but that
installed half a kernel and bricked the machine upon reboot. Lucky me,
I had fixed OOB access just the day before.

Did the usual world/kernel build and ran etcupdate, merging in my
local changes. This bricked the system again, as it removed the -x bit
on /etc/rc.d/netif; I filed
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=255514 for that,
though. (I never had such trouble with mergemaster; even understanding
what etcupdate is trying to do and how to bootstrap it is a mystery to
me.)
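
(For reference, my reading of the etcupdate man page is that the
intended workflow is roughly the one below; the bootstrap step is the
part I find least obvious, so treat this as a sketch rather than
gospel.)

  # one-time bootstrap: build the pristine reference tree from the
  # sources matching the currently installed world
  etcupdate extract

  # after each subsequent installworld:
  etcupdate            # three-way merge of /etc against old and new sources
  etcupdate resolve    # interactively handle any merge conflicts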

Anyway, I have a data zpool on 2x encrypted GELI providers that I can only
unlock (and zpool import) with 2 passphrases after the system has booted.
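
(For context, the manual dance after boot looks roughly like this;
gpt/data0 and gpt/data1 stand in for the real provider names.)

  geli attach /dev/gpt/data0    # prompts for the first passphrase
  geli attach /dev/gpt/data1    # prompts for the second passphrase
  zpool import data             # only works once both .eli devices exist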

Color me surprised when some rc script thought otherwise and tried to
import the pool during boot. Why does it do that? That's not supposed
to work, and it should not even touch the encrypted bits (yet).

mountroot: waiting for device /dev/mirror/gm0a...
Dual Console: Serial Primary, Video Secondary
GEOM_ELI: Device gpt/swap0.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: accelerated software
GEOM_ELI: Device gpt/swap1.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: accelerated software
Setting hostuuid: d7902500-4c7c-0706-0025-90d77c4c0e0f.
Setting hostid: 0x8a2b4277.
cannot import 'data': no such pool or dataset
        Destroy and re-create the pool from
        a backup source.
ipmi0: Unknown IOCTL 40086481
ipmi0: Unknown IOCTL 40086481
cachefile import failed, retrying
nvpair_value_nvlist(nvp, &rv) == 0 (0x16 == 0)
ASSERT at /usr/src/sys/contrib/openzfs/module/nvpair/fnvpair.c:586:fnvpair_value_nvlist()
pid 69 (zpool), jid 0, uid 0: exited on signal 6
Abort trap
cannot import 'data': no such pool or dataset
        Destroy and re-create the pool from
        a backup source.
ipmi0: Unknown IOCTL 40086481
ipmi0: Unknown IOCTL 40086481
cachefile import failed, retrying
nvpair_value_nvlist(nvp, &rv) == 0 (0x16 == 0)
ASSERT at /usr/src/sys/contrib/openzfs/module/nvpair/fnvpair.c:586:fnvpair_value_nvlist()
pid 74 (zpool), jid 0, uid 0: exited on signal 6
Abort trap
Starting file system checks:
/dev/mirror/gm0a: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0a: clean, 370582 free (814 frags, 46221 blocks, 0.2%
fragmentation)
/dev/mirror/gm0d: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0d: clean, 867640 free (1160 frags, 108310 blocks, 0.1%
fragmentation)
/dev/mirror/gm0e: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0e: clean, 1267948 free (17228 frags, 156340 blocks, 0.7%
fragmentation)
Mounting local filesystems:.


What do I need to do to _not_ have any zpool operations attempted
during startup? How does it even know of the existence of that pool?

I guess it's zfs_enable=NO to stop /etc/rc.d/zpool from messing about.
But more importantly, the GELI providers don't exist yet, so why does
zpool then crash with an abort trap? Shouldn't it be a bit more robust
on that front?
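
(Assuming I read /etc/rc.d/zpool correctly, it imports whatever is
recorded in the zpool.cache file, so something along these lines
should keep it away from this pool; the paths and rcvar are what I'd
expect on 13, correct me if I'm wrong.)

  # disable the ZFS rc scripts entirely (fine here, no pool is needed at boot)
  sysrc zfs_enable="NO"

  # or, less drastically, keep the pool out of the cachefile that
  # "zpool import -c /etc/zfs/zpool.cache" reads during startup
  zpool set cachefile=none data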

Thanks all
Uli

