How does this weird boot timing bug arise: init rc scripts or zfs?

Meowthink meowthink at gmail.com
Sun Oct 4 02:50:34 UTC 2020


Hello hackers,
Recently I ran installworld and rebooted a server. It seems to work, but
my kerberized nfsd, or more precisely gssd, is not functional.
At first I thought it might be a bug in stable, so I did some trivial
tests, replacing first the kernel with the releng one and then the whole
world. I found that it has nothing to do with the kernel, and that it
triggers randomly when rebooting with a recent stable/11 world
(releng/11.4 seems fine).
To dig deeper, here is what the console showed on a failed boot:
```
Oct  3 20:23:59 r kernel: Starting file system checks:
Oct  3 20:23:59 r kernel: /etc/rc: WARNING: run_rc_command: cannot run
/usr/sbin/gssd
Oct  3 20:23:59 r kernel: Mounting local filesystems:.
Oct  3 20:23:59 r kernel: Updating CPU Microcode...
Oct  3 20:23:59 r kernel: Done.
Oct  3 20:23:59 r kernel: Starting ctld.
Oct  3 20:23:59 r kernel: ctld: bind(2) failed for [::]: Can't assign
requested address
Oct  3 20:23:59 r kernel: ctld: bind(2) failed for 0.0.0.0: Can't
assign requested address
Oct  3 20:23:59 r kernel: ctld: failed to apply configuration; exiting
Oct  3 20:23:59 r kernel: /etc/rc: WARNING: failed to start ctld
Oct  3 20:23:59 r kernel: ELF ldconfig path: /lib /usr/lib
/usr/lib/compat /usr/local/lib /usr/dt/lib /usr/local/lib/compat
/usr/local/lib/gcc9 /usr/local/lib/graphviz /usr/local/lib/nss
/usr/local/lib/perl5/5.28/mach/CORE /usr/local/lib/pth
/usr/local/lib/qt4 /usr/local/lib/qt5 /usr/local/lib/samba4
/usr/local/llvm10/lib /usr/local/share/chromium
Oct  3 20:23:59 r kernel: 32-bit compatibility ldconfig path:
/usr/lib32 /usr/local/lib32/compat
Oct  3 20:23:59 r kernel: Setting hostname: r.domain.net.
Oct  3 20:23:59 r kernel: Setting up harvesting:
[UMA],[FS_ATIME],SWI,INTERRUPT,NET_NG,NET_ETHER,NET_TUN,MOUSE,KEYBOARD,ATTACH,CACHED
Oct  3 20:23:59 r kernel: Feeding entropy: .
Oct  3 20:23:59 r kernel: Starting Network: lo0 bge0 bge1.
```
Then I realized that / and /usr are separate zfs filesystems (same
zpool). It may be that / was mounted but /usr was not yet when gssd was
started. So I changed line 7 of my /etc/rc.d/gssd to
"# REQUIRE: mountcritlocal" and, while at it, the REQUIRE line of
/etc/rc.d/ctld to "# REQUIRE: netif". Now everything works fine, even
after rebooting several times.
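To double-check what a script declares after such an edit, the REQUIRE line can be read straight from the file. This is only a minimal sketch; rc_requires is a helper name I made up here, not something provided by rc.subr:

```shell
#!/bin/sh
# rc_requires FILE: print whatever follows "# REQUIRE:" in an rc.d script.
# Hypothetical helper for illustration, not part of rc.subr.
rc_requires() {
    sed -n 's/^#[[:space:]]*REQUIRE:[[:space:]]*//p' "$1"
}
```

With the edits described above, `rc_requires /etc/rc.d/gssd` should print mountcritlocal and `rc_requires /etc/rc.d/ctld` should print netif.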
What confuses me is how this happens. It seems that /etc/rc.d/gssd
(and /etc/rc.d/ctld as well) hasn't been changed since 2016. Neither
gssd nor init in stable/11 has had any functional change since
releng/11.4. Maybe zfs? But that is in the kernel, and kernel r366306
with a releng/11.4 world works (though I only tested a few times, since
rebooting is too boring).
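One way to compare the two worlds without rebooting would be to look at the start order that rcorder(8) computes from the scripts. A rough sketch, assuming the usual one-path-per-line rcorder output; runs_before is a hypothetical helper of mine:

```shell
#!/bin/sh
# runs_before FIRST SECOND: read an ordered list of script paths on stdin
# (for example the output of rcorder) and succeed only if FIRST is listed
# before SECOND. Hypothetical helper for illustration.
runs_before() {
    awk -v first="$1" -v second="$2" '
        # Reduce each path to its basename, e.g. /etc/rc.d/gssd -> gssd.
        { n = split($0, parts, "/"); name = parts[n] }
        # Remember the line number of the first occurrence of each name.
        name == first  && !f { f = NR }
        name == second && !s { s = NR }
        # Exit 0 only when both names were seen and FIRST came earlier.
        END { exit !(f && s && f < s) }
    '
}
```

For example, `rcorder /etc/rc.d/* 2>/dev/null | runs_before mountcritlocal gssd && echo ok` would print ok only when mountcritlocal is ordered before gssd.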
Any ideas?

Cheers,
meowthink


More information about the freebsd-hackers mailing list