Re: Kernel modules not loading on 15-prerelease

From: bob prohaska <fbsd_at_www.zefox.net>
Date: Thu, 04 Sep 2025 14:14:11 UTC
On Wed, Sep 03, 2025 at 06:18:03PM -0700, Mark Millard wrote:
> On Sep 3, 2025, at 17:39, bob prohaska <fbsd@www.zefox.net> wrote:
> 
> > Most list "depends on kernel.1500061 (1500061,1500061)"; perhaps
> > 10% list 1500063, and one lists some of each (!):
> > /boot/kernel/usb.ko
> >  depends on kernel.1500061 (1500061,1500061)
> >  depends on kernel.1500061 (1500061,1500061)
> >  depends on kernel.1500061 (1500061,1500061)
> >  depends on kernel.1500061 (1500061,1500061)
> >  depends on kernel.1500061 (1500061,1500061)
> >  depends on kernel.1500061 (1500061,1500061)
> >  depends on kernel.1500061 (1500061,1500061)
> >  depends on kernel.1500061 (1500061,1500061)
> >  depends on kernel.1500063 (1500063,1500063)
> >  depends on kernel.1500063 (1500063,1500063)
> > The above looks wrong to me.....
> 
> Using WITHOUT_CLEAN without META_MODE (with filemon
> working) tends to mean more things are not rebuilt.
> Likely you are seeing the distinction between rebuilt
> things and not-rebuilt things.
> 
> /boot/kernel/usb.ko is certainly interesting.
> Apparently, different parts of it have their
> own dependencies on the kernel and only some
> parts were rebuilt before being linked
> together. (I do not know the details.)
> 
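A rough way to tally a listing like the one above, assuming it has been
saved in the same shape (a module path line followed by indented
"depends on kernel.N" lines, without the quote markers). The function
name is hypothetical and this is only a sketch:

```shell
# tally_kernel_deps: hypothetical helper. Reads a listing on stdin in which
# each module path line is followed by " depends on kernel.N (...)" lines,
# and prints one "module version count" line per distinct pair.
tally_kernel_deps() {
    awk '
        /^\//  { mod = $1; next }                 # module path, e.g. /boot/kernel/usb.ko
        $1 == "depends" && $3 ~ /^kernel\./ {
            ver = $3; sub(/^kernel\./, "", ver)   # keep only the number
            count[mod " " ver]++
        }
        END { for (k in count) print k, count[k] }
    ' | sort
}
```

Any module that shows up with more than one version number is a candidate
for having been only partly rebuilt.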
> >> Going in a different direction, I'll remind:
> >> 
> >>     The environment of make(1) for the build can be controlled via the
> >>     SRC_ENV_CONF variable, which defaults to /etc/src-env.conf.  Some
> >>     examples that may only be set in this file are WITH_DIRDEPS_BUILD, and
> >>     WITH_META_MODE, and MAKEOBJDIRPREFIX as they are environment-only
> >>     variables.
> >> 
> >> I'll note that I use an env prefix before a make
> >> command (in a script):
> >> 
> >> env __MAKE_CONF="/usr/home/root/src.configs/make.conf" \
> >> SRCCONF="/dev/null" SRC_ENV_CONF="/usr/home/root/src.configs/src.conf.aarch64-nodbg-clang.aarch64-host" \
> >> MAKEOBJDIRPREFIX="/usr/obj/BUILDs/main-aarch64-usr_src-nodbg-clang" \
> >> WITH_META_MODE=yes \
> >> time -l make $*
> >> 
> >> You reference elsewhere using: -DWITH_META_MODE
> >> That looks wrong to me. For make:
> >> 
> >>     -D variable
> >>             Define variable to be 1, in the global scope.
> >> 
> >> is defining a make variable, not an environment variable.
> >> I'm not saying that you should use -e but, as evidence,
> >> note the wording:
> >> 
> >>     -e      Let environment variables override global variables within
> >>             makefiles.
> >> 
> >> So: global variables are not environment variables.
> >> 
> >> As far as I can tell, you have not been using META_MODE
> >> based on what you report doing.

For the most recent (past week) experiments that's correct.
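
For my own reference, the environment-versus-make-variable distinction
can be checked from the shell before starting a build. A minimal sketch;
the function name is made up, and it only looks at the process
environment, not at anything set in /etc/src-env.conf:

```shell
# meta_mode_in_env: hypothetical check. Succeeds only if WITH_META_MODE is
# exported, i.e. visible to child processes such as make(1). A child shell
# is used because it sees exported variables only.
meta_mode_in_env() {
    sh -c '[ -n "${WITH_META_MODE+set}" ]'
}

# An "env VAR=value command" prefix places the variable in the command's
# environment:
#   env WITH_META_MODE=yes make buildworld
# whereas "make -DWITH_META_MODE" only defines a make global variable.
```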

> >> 
> >> Absent META_MODE use, the system's recent change to using
> >> WITHOUT_CLEAN by default for buildworld and buildkernel
> >> can leave more files stale that should have been
> >> rebuilt.
> >> 
> >>> At the same time, I see
> >>> # uname -K
> >>> 1500063
> >>> # uname -U
> >>> 1500061
> >> 
> >> -U output is a separate issue from the kernel
> >> and module version requirement mismatches as
> >> far as I can tell.
> >> 
> >> uname gets the -U output text via:
> >> 
> >> static void                                      
> >> native_uservers(void)
> >> {                                                
> >>        static char buf[128];
> >> 
> >>        snprintf(buf, sizeof(buf), "%d", __FreeBSD_version);
> >>        uservers = buf;
> >> }
> >> 
> >> Recently WITHOUT_CLEAN became the default. So
> >> its status for your build depends on the timing
> >> of the commit that you used.
> >> 
> >> A WITHOUT_CLEAN (non-META_MODE) build might not
> >> have rebuilt uname. What does the following report:
> >> 
> >> # file /usr/obj/usr/src/arm.armv7/usr.bin/uname/uname
> >> 
> >> My paths are different, but here is an example (I've not
> >> built for armv7 in a long time):
> >> 
> >> # file /usr/obj/BUILDs/main-CA7-nodbg-clang/usr/main-src/arm.armv7/usr.bin/uname/uname
> >> /usr/obj/BUILDs/main-CA7-nodbg-clang/usr/main-src/arm.armv7/usr.bin/uname/uname: ELF 32-bit LSB executable, ARM, EABI5 version 1 (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, FreeBSD-style, for FreeBSD 15.0 (1500048), not stripped
> >> 
> >> Note that "(1500048)". Which number does your build
> >> of uname show for what is in your build tree (not
> >> what is installed)?
> >> 
> > # file /usr/obj/usr/src/arm.armv7/usr.bin/uname/uname
> > /usr/obj/usr/src/arm.armv7/usr.bin/uname/uname: ELF 32-bit LSB executable, ARM, EABI5 version 1 (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, FreeBSD-style, for FreeBSD 15.0 (1500063), not stripped
> > #
> 
> Hmm. I'm surprised at the 1500063 when uname -U reported
> 1500061. I wonder if it has since been rebuilt.
> 
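The number in parentheses can be pulled out of file(1) output
mechanically, which makes it easier to compare a build-tree binary with
its installed counterpart. A sketch, assuming the "for FreeBSD X.Y (N)"
wording shown above; both function names are hypothetical:

```shell
# osversion_from_file_output: hypothetical helper. Reads file(1) output on
# stdin and prints the number inside "for FreeBSD X.Y (N)", if present.
osversion_from_file_output() {
    sed -n 's/.*for FreeBSD [0-9.]* (\([0-9][0-9]*\)).*/\1/p'
}

# osversion_of: hypothetical wrapper that runs file(1) on a single path.
osversion_of() {
    file "$1" | osversion_from_file_output
}

# Possible use, comparing the built and installed copies:
#   osversion_of /usr/obj/usr/src/arm.armv7/usr.bin/uname/uname
#   osversion_of /usr/bin/uname
```

If the two numbers differ, that would suggest the installed copy did not
come from the current build tree.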
> >>> How can the kernel and modules get out of sync?
> >> 
> >> Using WITHOUT_CLEAN (recently the default) without META_MODE,
> >> which would otherwise cause updates based on better dependency
> >> information.
> >> 
> >>> Even then, /usr/src was updated, followed by an immediate
> >>> buildworld/buildkernel.
> >> 
> >> You do not mention installations explicitly, just
> >> builds. Was the world installed too?
> > Yes, sorry for the ambiguity.
> > 
> >> 
> >>> In that case, how did world and kernel
> >>> become mismatched and remain so for a week or more?
> >> 
> >> See above about correctly providing META_MODE and
> >> about the recent change to have WITHOUT_CLEAN by
> >> default (fewer rebuilds by default).
> >> 
> >>> The suggestion to delete /usr/obj is looking unavoidable, are
> >>> additional steps warranted?
> >> 
> >> Given that you seem to have been doing
> >> builds where META_MODE was not helping to
> >> track dependencies, I'd recommend a from-scratch
> >> META_MODE based build to get the tracking
> >> information fully in place for use in later
> >> builds.
> >> 
> >> If you are to do META_MODE builds, builds that
> >> disable META_MODE mess up its dependency tracking
> >> information by not updating it. Thus, you should avoid
> >> disabling META_MODE for a specific build unless you stop
> >> using META_MODE overall.
> >> 
> > 
> > When filemon stopped loading I ran a few build/install 
> > cycles without meta mode. I thought that would be harmless, 
> > but maybe not. Right now meta mode is running with the 
> > NO_FILEMON option.
> 
> Good point, META_MODE does not do as much when
> filemon is not operational.  Once filemon is
> operational, you will likely need to synchronize
> META_MODE again so that it gets all the information.
> 
Ahh, that is a significant detail. I expected "no filemon"
to perhaps be slower, but to be functionally equivalent. 


> > I'll let it run till it either finishes or crashes, then
> > decide whether to remove /usr/obj after rebooting.
> > 
> > One possible culprit is of course me: From time to time
> > the machine seemingly stalls, with no response to keyboard
> > or debugger escape. I've routinely power-cycled the machine
> > to recover, confident that internal checks would catch any
> > inconsistencies that might be introduced by ungraceful
> > shutdown. Might my confidence be misplaced? 
> 
> I do not know that you have many alternatives.

Might there be some program that can independently
audit installed binaries against those in /usr/obj?
> 
> One thing using pkgbase style installs does is leave checksum
> data around for everything pkg installs. "pkg check -sa",
> if it is executable, will try to checksum everything handled
> via pkg and report on mismatches, both base packages and port
> packages. There is no equivalent for any other installation
> style that I know of.

It looks like pkgbase is a subject of much discussion and at
least some contention. Is there a tutorial somewhere of how
to use it in a simple self-hosted system? From the page at
https://wiki.freebsd.org/action/show/pkgbase?action=show&redirect=PkgBase
it seems a good deal more involved than "make installworld" 8-)


Thanks for writing!