How does this weird boot timing bug arise: init rc scripts or zfs?

Rick Macklem rmacklem at uoguelph.ca
Mon Oct 5 00:41:43 UTC 2020


Meowthink wrote:
>On Mon, Oct 5, 2020 at 5:37 AM Warner Losh <imp at bsdimp.com> wrote:
>>
>>
>>
>> On Sun, Oct 4, 2020, 3:28 PM Rick Macklem <rmacklem at uoguelph.ca> wrote:
>>>
>>> Meowthink wrote:
>>> >Hello hackers,
>>> >Recently I ran installworld and rebooted a server. It seems to work, but
>>> >my kerberized nfsd, or more precisely gssd, is not functional.
>>> >At first I thought it might be a bug in stable, so I did some trivial
>>> >tests, replacing the kernel with the releng one, then the whole world, and
>>> >found that this is unrelated to the kernel and triggers randomly
>>> >when rebooting with a recent stable/11 world (releng/11.4 seems fine).
>>> >To dig deeper, here is what the console showed on a failed boot:
>>> >```
>>> >Oct  3 20:23:59 r kernel: Starting file system checks:
>>> >Oct  3 20:23:59 r kernel: /etc/rc: WARNING: run_rc_command: cannot run
>>> >/usr/sbin/gssd
>>> >Oct  3 20:23:59 r kernel: Mounting local filesystems:.
>>> >Oct  3 20:23:59 r kernel: Updating CPU Microcode...
>>> >Oct  3 20:23:59 r kernel: Done.
>>> >Oct  3 20:23:59 r kernel: Starting ctld.
>>> >Oct  3 20:23:59 r kernel: ctld: bind(2) failed for [::]: Can't assign
>>> >requested address
>>> >Oct  3 20:23:59 r kernel: ctld: bind(2) failed for 0.0.0.0: Can't
>>> >assign requested address
>>> >Oct  3 20:23:59 r kernel: ctld: failed to apply configuration; exiting
>>> >Oct  3 20:23:59 r kernel: /etc/rc: WARNING: failed to start ctld
>>> >Oct  3 20:23:59 r kernel: ELF ldconfig path: /lib /usr/lib
>>> >/usr/lib/compat /usr/local/lib /usr/dt/lib /usr/local/lib/compat
>>> >/usr/local/lib/gcc9 /usr/local/lib/graphviz /usr/local/lib/nss
>>> >/usr/local/lib/perl5/5.28/mach/CORE /usr/local/lib/pth
>>> >/usr/local/lib/qt4 /usr/local/lib/qt5 /usr/local/lib/samba4
>>> >/usr/local/llvm10/lib /usr/local/share/chromium
>>> >Oct  3 20:23:59 r kernel: 32-bit compatibility ldconfig path:
>>> >/usr/lib32 /usr/local/lib32/compat
>>> >Oct  3 20:23:59 r kernel: Setting hostname: r.domain.net.
>>> >Oct  3 20:23:59 r kernel: Setting up harvesting:
>>> >[UMA],[FS_ATIME],SWI,INTERRUPT,NET_NG,NET_ETHER,NET_TUN,MOUSE,KEYBOARD,ATTACH,CACHED
>>> >Oct  3 20:23:59 r kernel: Feeding entropy: .
>>> >Oct  3 20:23:59 r kernel: Starting Network: lo0 bge0 bge1.
>>> >```
>>> >Then I realized I have / and /usr on separate ZFS datasets (same zpool).
>>> >It may be that / was mounted but /usr was not. Thus I changed line 7 of my
>>> >/etc/rc.d/gssd to # REQUIRE: mountcritlocal.
>>> I don't know what has changed post-11.4 to cause this.
>>> However, the problem with adding "mountcritlocal" is that it assumes
>>> /usr is a locally mounted file system and not NFS mounted nor a subtree
>>> of "/".
>>> --> I think a better solution might be to move gssd to /sbin, which should
>>>       always be a part of the root fs. (/etc/rc.d/gssd already has "root" as
>>>       REQUIRED.)
>>
>>
>> Make the move. It is an early utility in reality.
>>
>> Warner
>
>But take care :)
>$ ldd /usr/sbin/gssd
>/usr/sbin/gssd:
>        libgssapi.so.10 => /usr/lib/libgssapi.so.10 (0x80082b000)
>        libkrb5.so.11 => /usr/lib/libkrb5.so.11 (0x800a35000)
>        libroken.so.11 => /usr/lib/libroken.so.11 (0x800cb4000)
>        libc.so.7 => /lib/libc.so.7 (0x800ec7000)
>        libasn1.so.11 => /usr/lib/libasn1.so.11 (0x80127e000)
>        libcom_err.so.5 => /usr/lib/libcom_err.so.5 (0x801521000)
>        libcrypt.so.5 => /lib/libcrypt.so.5 (0x801723000)
>        libcrypto.so.8 => /lib/libcrypto.so.8 (0x801a00000)
>        libhx509.so.11 => /usr/lib/libhx509.so.11 (0x801e75000)
>        libwind.so.11 => /usr/lib/libwind.so.11 (0x8020c2000)
>        libheimbase.so.11 => /usr/lib/libheimbase.so.11 (0x8022ea000)
>        libprivateheimipcc.so.11 => /usr/lib/libprivateheimipcc.so.11
>(0x8024ee000)
>        libthr.so.3 => /lib/libthr.so.3 (0x8026f1000)
Good point. Moving all those libraries isn't worth the effort, imho.
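
To make that cost concrete, here is a rough sketch (not something posted in the thread) that filters ldd output down to the libraries resolved under /usr. The function name libs_under_usr is made up for illustration, and the sample lines are copied from the listing quoted above.

```sh
#!/bin/sh
# Hypothetical helper: read ldd output on stdin and print every resolved
# library path that lives under /usr/. ldd lines look like
#   "name => /path/to/lib (0xaddr)"
# so the path is the third whitespace-separated field.
libs_under_usr() {
    awk '$2 == "=>" && $3 ~ /^\/usr\// { print $3 }'
}

# Three lines taken from the ldd output quoted above.
sample='        libgssapi.so.10 => /usr/lib/libgssapi.so.10 (0x80082b000)
        libc.so.7 => /lib/libc.so.7 (0x800ec7000)
        libkrb5.so.11 => /usr/lib/libkrb5.so.11 (0x800a35000)'

printf '%s\n' "$sample" | libs_under_usr
```

On a live system the same filter could presumably be fed directly, as in `ldd /usr/sbin/gssd | libs_under_usr`; every path it prints would have to move to /lib along with the binary.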

I suppose "mountcritlocal" is harmless and fixes the case where /usr
is a separately mounted local FS.
(I can't resist pointing out that we no longer need to worry about the
 size limitation of a 2.5Mbyte RK05 disk, so having /usr on a separate
 file system may no longer be a critical requirement, but...;-)
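
For illustration only, the rc.d change could be sketched as below. The two header lines are a hypothetical stand-in for the top of /etc/rc.d/gssd (the thread only confirms that "root" is already required), and add_require is an invented helper name. Unlike the edit described earlier, which replaced the line, this sketch appends "mountcritlocal" and keeps "root".

```sh
#!/bin/sh
# Hypothetical stand-in for the rcorder header of /etc/rc.d/gssd;
# the real script may contain more lines than this.
header='#!/bin/sh
# PROVIDE: gssd
# REQUIRE: root'

# Hypothetical helper: append one dependency to the "# REQUIRE:" line of
# an rc script read on stdin, and write the result to stdout.
add_require() {
    sed "s/^# REQUIRE:.*/& $1/"
}

printf '%s\n' "$header" | add_require mountcritlocal
```

Assuming rcorder(8) orders scripts by these REQUIRE/PROVIDE comments, the resulting "# REQUIRE: root mountcritlocal" line should push gssd after the local filesystems are mounted.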

rick

>>
>>
>> I have a similar problem with the rpc.tlsclntd daemon I have developed
>> for NFS-over-TLS.
>> - Neither sec=krb5[ip] nor tls can be used for an NFS mounted root,
>>   but I think we just have to live with that?
>>
>> Do others have any suggestions? rick
>>
>> By the way, I also changed /etc/rc.d/ctld to #
>> REQUIRE: netif. Everything works fine, even after rebooting several times.
>> What confuses me is how this happens. It seems that /etc/rc.d/gssd
>> (and likewise /etc/rc.d/ctld) hasn't been changed since 2016. Both
>> gssd and init in stable/11 have had no functional changes since
>> releng/11.4. Maybe zfs? But it's in the kernel, and kernel r366306
>> with a releng/11.4 world works (though I only tested a few times, since
>> rebooting is too boring).
>> Any ideas?
>>
>> Cheers,
>> meowthink
>> _______________________________________________
>> freebsd-hackers at freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe at freebsd.org"


