Re: zpools no longer exist after boot

From: Ronald Klop <ronald-lists_at_klop.ws>
Date: Fri, 29 Nov 2024 14:49:27 UTC
From: Dennis Clarke <dclarke@blastwave.org>
Date: Thursday, 28 November 2024 15:45
To: Alan Somers <asomers@freebsd.org>
CC: Current FreeBSD <freebsd-current@freebsd.org>
Subject: Re: zpools no longer exist after boot
> 
> On 11/28/24 08:52, Alan Somers wrote:
> > On Thu, Nov 28, 2024, 7:06AM Dennis Clarke <dclarke@blastwave.org> wrote:
> >
> >>
> >> This is a baffling problem wherein two zpools no longer exist after
> >> boot. This is :
> .
> .
> .
> > Do you have zfs_enable="YES" set in /etc/rc.conf? If not then nothing will
> > get imported.
> >
> > Regarding the cachefile property, it's expected that "zpool import" will
> > change it, unless you do "zpool import -O cachefile=whatever".
> >
> 
> The rc script seems to do something slightly different with zpool import -c $FOOBAR thus :
> 
> 
> titan# cat  /etc/rc.d/zpool
> #!/bin/sh
> #
> #
> 
> # PROVIDE: zpool
> # REQUIRE: hostid disks
> # BEFORE: mountcritlocal
> # KEYWORD: nojail
> 
> . /etc/rc.subr
> 
> name="zpool"
> desc="Import ZPOOLs"
> rcvar="zfs_enable"
> start_cmd="zpool_start"
> required_modules="zfs"
> 
> zpool_start()
> {
>          local cachefile
> 
>          for cachefile in /etc/zfs/zpool.cache /boot/zfs/zpool.cache; do
>                  if [ -r $cachefile ]; then
>                          zpool import -c $cachefile -a -N
>                          if [ $? -ne 0 ]; then
>                                  echo "Import of zpool cache ${cachefile} failed," \
>                                      "will retry after root mount hold release"
>                                  root_hold_wait
>                                  zpool import -c $cachefile -a -N
>                          fi
>                          break
>                  fi
>          done
> }
> 
> load_rc_config $name
> run_rc_command "$1"
> titan#
> 
> 
> 
> I may as well nuke the pre-existing cache file and start over :
> 
> 
> titan# ls -l /etc/zfs/zpool.cache /boot/zfs/zpool.cache
> -rw-r--r--  1 root wheel 1424 Jan 16  2024 /boot/zfs/zpool.cache
> -rw-r--r--  1 root wheel 4960 Nov 28 14:15 /etc/zfs/zpool.cache
> titan#
> titan#
> titan# rm /boot/zfs/zpool.cache
> titan# zpool set cachefile="/boot/zfs/zpool.cache" t0
> titan#
> titan# ls -l /boot/zfs/zpool.cache
> -rw-r--r--  1 root wheel 1456 Nov 28 14:27 /boot/zfs/zpool.cache
> titan#
> titan# zpool set cachefile="/boot/zfs/zpool.cache" leaf
> titan#
> titan# ls -l /boot/zfs/zpool.cache
> -rw-r--r--  1 root wheel 3536 Nov 28 14:28 /boot/zfs/zpool.cache
> titan#
> titan# zpool set cachefile="/boot/zfs/zpool.cache" proteus
> titan#
> titan# ls -l /boot/zfs/zpool.cache
> -rw-r--r--  1 root wheel 4960 Nov 28 14:28 /boot/zfs/zpool.cache
> titan#
> titan# zpool get cachefile t0
> NAME  PROPERTY   VALUE                  SOURCE
> t0    cachefile  /boot/zfs/zpool.cache  local
> titan#
> titan# zpool get cachefile leaf
> NAME  PROPERTY   VALUE                  SOURCE
> leaf  cachefile  /boot/zfs/zpool.cache  local
> titan#
> titan# zpool get cachefile proteus
> NAME     PROPERTY   VALUE                  SOURCE
> proteus  cachefile  /boot/zfs/zpool.cache  local
> titan#
> 
> titan#
> titan# reboot
> Nov 28 14:34:05 Waiting (max 60 seconds) for system process `vnlru' to stop... done
> Waiting (max 60 seconds) for system process `syncer' to stop...
> Syncing disks, vnodes remaining... 0 0 0 0 0 0 done
> All buffers synced.
> Uptime: 2h38m57s
> GEOM_MIRROR: Device swap: provider destroyed.
> GEOM_MIRROR: Device swap destroyed.
> uhub5: detached
> uhub1: detached
> uhub4: detached
> uhub2: detached
> uhub3: detached
> uhub6: detached
> uhub0: detached
> ix0: link state changed to DOWN
> .
> .
> .
> 
> Starting iscsid.
> Starting iscsictl.
> Clearing /tmp.
> Updating /var/run/os-release done.
> Updating motd:.
> Creating and/or trimming log files.
> Starting syslogd.
> No core dumps found.
> Starting local daemons:failed to open cache file: No such file or directory
> .
> Starting ntpd.
> Starting powerd.
> Mounting late filesystems:.
> Starting cron.
> Performing sanity check on sshd configuration.
> Starting sshd.
> Starting background file system
> FreeBSD/amd64 (titan) (ttyu0)
> 
> login: root
> Password:
> Nov 28 14:36:29 titan login[4162]: ROOT LOGIN (root) ON ttyu0
> Last login: Thu Nov 28 14:33:45 on ttyu0
> FreeBSD 15.0-CURRENT (GENERIC-NODEBUG) #1 main-n273749-4b65481ac68a-dirty: Wed Nov 20 15:08:52 GMT 2024
> 
> Welcome to FreeBSD!
> 
> Release Notes, Errata: https://www.FreeBSD.org/releases/
> Security Advisories:   https://www.FreeBSD.org/security/
> FreeBSD Handbook:      https://www.FreeBSD.org/handbook/
> FreeBSD FAQ:           https://www.FreeBSD.org/faq/
> Questions List:        https://www.FreeBSD.org/lists/questions/
> FreeBSD Forums:        https://forums.FreeBSD.org/
> 
> Documents installed with the system are in the /usr/local/share/doc/freebsd/
> directory, or can be installed later with:  pkg install en-freebsd-doc
> For other languages, replace "en" with a language code like de or fr.
> 
> Show the version of FreeBSD installed:  freebsd-version ; uname -a
> Please include that output and any error messages when posting questions.
> Introduction to manual pages:  man man
> FreeBSD directory layout:      man hier
> 
> To change this login announcement, see motd(5).
> You have new mail.
> titan#
> titan# zpool list
> NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP HEALTH  ALTROOT
> leaf     18.2T   984K  18.2T        -         -     0%     0%  1.00x ONLINE  -
> proteus  1.98T   361G  1.63T        -         -     1%    17%  1.00x ONLINE  -
> t0        444G  91.2G   353G        -         -    27%    20%  1.00x ONLINE  -
> titan#
> 
> This is progress ... however the cachefile property is wiped out again :
> 
> titan# zpool get cachefile t0
> NAME  PROPERTY   VALUE      SOURCE
> t0    cachefile  -          default
> titan# zpool get cachefile leaf
> NAME  PROPERTY   VALUE      SOURCE
> leaf  cachefile  -          default
> titan# zpool get cachefile proteus
> NAME     PROPERTY   VALUE      SOURCE
> proteus  cachefile  -          default
> titan#
> 
> Also, strangely, none of the filesystems in proteus are mounted :
> 
> titan#
> titan# zfs list -o name,exec,checksum,canmount,mounted,mountpoint -r proteus
> NAME                EXEC  CHECKSUM   CANMOUNT  MOUNTED  MOUNTPOINT
> proteus             on    sha512     on        no       none
> proteus/bhyve       off   sha512     on        no       /bhyve
> proteus/bhyve/disk  off   sha512     on        no       /bhyve/disk
> proteus/bhyve/isos  off   sha512     on        no       /bhyve/isos
> proteus/obj         on    sha512     on        no       /usr/obj
> proteus/src         on    sha512     on        no       /usr/src
> titan#
> 
> If I reboot again without doing anything will the zpools re-appear ?
> 
> 
> titan#
> titan# Nov 28 14:37:08 titan su[4199]: admsys to root on /dev/pts/0
> 
> titan# reboot
> Nov 28 14:40:29 Waiting (max 60 seconds) for system process `vnlru' to stop... done
> Waiting (max 60 seconds) for system process `syncer' to stop...
> Syncing disks, vnodes remaining... 0 0 0 0 0 done
> All buffers synced.
> Uptime: 4m50s
> GEOM_MIRROR: Device swap: provider destroyed.
> GEOM_MIRROR: Device swap destroyed.
> uhub4: detached
> uhub1: detached
> uhub5: detached
> uhub0: detached
> uhub3: detached
> uhub6: detached
> uhub2: detached
> ix0: link state changed to DOWN
> .
> .
> .
> Starting iscsid.
> Starting iscsictl.
> Clearing /tmp.
> Updating /var/run/os-release done.
> Updating motd:.
> Creating and/or trimming log files.
> Starting syslogd.
> No core dumps found.
> Starting local daemons:failed to open cache file: No such file or directory
> .
> Starting ntpd.
> Starting powerd.
> Mounting late filesystems:.
> Starting cron.
> Performing sanity check on sshd configuration.
> Starting sshd.
> Starting background file system
> FreeBSD/amd64 (titan) (ttyu0)
> 
> login: root
> Password:
> Nov 28 14:43:01 titan login[4146]: ROOT LOGIN (root) ON ttyu0
> Last login: Thu Nov 28 14:36:29 on ttyu0
> FreeBSD 15.0-CURRENT (GENERIC-NODEBUG) #1 main-n273749-4b65481ac68a-dirty: Wed Nov 20 15:08:52 GMT 2024
> 
> Welcome to FreeBSD!
> 
> Release Notes, Errata: https://www.FreeBSD.org/releases/
> Security Advisories:   https://www.FreeBSD.org/security/
> FreeBSD Handbook:      https://www.FreeBSD.org/handbook/
> FreeBSD FAQ:           https://www.FreeBSD.org/faq/
> Questions List:        https://www.FreeBSD.org/lists/questions/
> FreeBSD Forums:        https://forums.FreeBSD.org/
> 
> Documents installed with the system are in the /usr/local/share/doc/freebsd/
> directory, or can be installed later with:  pkg install en-freebsd-doc
> For other languages, replace "en" with a language code like de or fr.
> 
> Show the version of FreeBSD installed:  freebsd-version ; uname -a
> Please include that output and any error messages when posting questions.
> Introduction to manual pages:  man man
> FreeBSD directory layout:      man hier
> 
> To change this login announcement, see motd(5).
> You have new mail.
> titan#
> titan# zpool list
> NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP HEALTH  ALTROOT
> leaf     18.2T  1.01M  18.2T        -         -     0%     0%  1.00x ONLINE  -
> proteus  1.98T   361G  1.63T        -         -     1%    17%  1.00x ONLINE  -
> t0        444G  91.2G   353G        -         -    27%    20%  1.00x ONLINE  -
> titan#
> titan# zfs list -o name,exec,checksum,canmount,mounted,mountpoint -r proteus
> NAME                EXEC  CHECKSUM   CANMOUNT  MOUNTED  MOUNTPOINT
> proteus             on    sha512     on        no       none
> proteus/bhyve       off   sha512     on        no       /bhyve
> proteus/bhyve/disk  off   sha512     on        no       /bhyve/disk
> proteus/bhyve/isos  off   sha512     on        no       /bhyve/isos
> proteus/obj         on    sha512     on        no       /usr/obj
> proteus/src         on    sha512     on        no       /usr/src
> titan#
> 
> Okay, so the zpools appear to be back, even though the cachefile property is strangely empty everywhere.  My guess is that the zpool
> rc script is bringing in the information during early boot.
> 
> Why do the zfs filesystems on proteus not mount? That is a strange problem, but at least the zpool can be used.
> 
> -- 
> --
> Dennis Clarke
> RISC-V/SPARC/PPC/ARM/CISC
> UNIX and Linux spoken
> 


Hi,

The output you provide contains this line:
"Starting local daemons:failed to open cache file: No such file or directory"

Where does that output come from? What is in your /etc/rc.local file?
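
As far as I know, "Starting local daemons:" is printed by /etc/rc.d/local just before it runs /etc/rc.local, and "failed to open cache file" looks like the error from a "zpool import -c" that was pointed at a file which does not exist. So my guess is that something in rc.local runs zpool. A rough sketch to list the candidate lines follows; find_cache_refs is just a helper name I made up, and which files are worth checking is an assumption on my part:

```shell
#!/bin/sh
# Hypothetical helper: print every line in the given files that calls
# zpool or mentions a cachefile, prefixed with "file:line:".
find_cache_refs()
{
        local f

        for f in "$@"; do
                # Skip files that do not exist or are not readable.
                [ -r "$f" ] || continue
                # The extra /dev/null makes grep print the file name
                # even when only one real file is searched.
                grep -nE 'zpool|cachefile' "$f" /dev/null || :
        done
}
```

You could call it as "find_cache_refs /etc/rc.local /etc/rc.conf.local" and see whether a zpool import with a stale cache file path turns up.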

Regards,
Ronald.