Re: "options MAXMEMDOM=2" vs. amd64 DBG kernel booting: 3000+ "kernel: Process (pid 1) got signal 5" notices during booting

From: Mark Millard <marklmi_at_yahoo.com>
Date: Thu, 23 Nov 2023 21:28:26 UTC
On Nov 21, 2023, at 21:43, Mark Millard <marklmi@yahoo.com> wrote:

> While my kernel/world build procedures build both DBG and NODBG
> kernels and worlds, I normally run the NODBG kernel and world,
> using DBG only when I need to for problem investigation.
> 
> I recently had reason to use the DBG kernel and found that it
> produced 3000+ instances of "kernel: Process (pid 1) got signal 5"
> during booting, as reported in /var/log/messages. An example is:
> 
> . . .
> Nov 20 23:13:09 7950X3D-UFS shutdown[20174]: reboot by root: 
> Nov 20 23:13:09 7950X3D-UFS syslogd: exiting on signal 15
> Nov 20 23:14:21 7950X3D-UFS syslogd: kernel boot file is /boot/kernel/kernel
> Nov 20 23:14:21 7950X3D-UFS kernel: got signal 5
> Nov 20 23:14:21 7950X3D-UFS kernel: Process (pid 1) got signal 5
> Nov 20 23:14:21 7950X3D-UFS syslogd: last message repeated 3133 times
> Nov 20 23:14:21 7950X3D-UFS kernel: intsmb0: <AMD FCH SMBus Controller> at device 20.0 on pci0
> . . .
> 
> This stopped when I commented out the:
> 
> options        MAXMEMDOM=2
> 
> that I've had historically and built, installed, and booted
> the resulting DBG kernel.
> 
> I'll note that I never had the messages for the NODBG kernel,
> despite it also having that line.
> 
> 
> For reference:
> 
> # uname -apKU
> FreeBSD 7950X3D-UFS 15.0-CURRENT FreeBSD 15.0-CURRENT #2 main-n266130-d521abdff236-dirty: Tue Nov 21 21:03:11 PST 2023     root@7950X3D-UFS:/usr/obj/BUILDs/main-amd64-dbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-DBG amd64 amd64 1500002 1500002
> 
> # ~/fbsd-based-on-what-commit.sh -C /usr/main-src/
> d521abdff236 (HEAD -> main, freebsd/main, freebsd/HEAD) Update ASLR stack sysctl description in security.7 and mitigations.7
> Author:     Ed Maste <emaste@FreeBSD.org>
> Commit:     Ed Maste <emaste@FreeBSD.org>
> CommitDate: 2023-10-24 22:29:25 +0000
> branch: main
> merge-base: d521abdff2367a5c72a773a815fc3d99403274f5
> merge-base: CommitDate: 2023-10-24 22:29:25 +0000
> n266130 (--first-parent --count for merge-base)
> 

A few more related notes follow.

First off, while I've historically built with NUMA
support present in the kernel, the only time I ever
configured UEFI/the system to indicate NUMA was in
some experiments in my early days of ThreadRipper
1950X use (years ago now). The "options MAXMEMDOM=2"
has been in place since then, with no known
contribution to any problems. (Until recently the
1950X was my only amd64 context. Now there is also
a Ryzen 9 7950X3D.)

Recently, I've been testing a potential patch to UFS
associated with changing from:

#define UFS_LINK_MAX 32767

to, say,

#define UFS_LINK_MAX 65530

I had discovered that "bulk -a" was broken on amd64 for
UFS: it hits UFS_LINK_MAX, given how the associated
directory trees are currently structured.
I'm not the author of the UFS patch.
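
As a rough illustration of what the two limits mean (this is not
part of the patch under test): assuming the usual UFS convention
that a directory's link count is 2 plus one per immediate
subdirectory, the limit caps how many subdirectories a single
directory can hold. The helper name max_subdirs is hypothetical.

```python
# Illustrative sketch only. Assumes the usual UFS convention: a
# directory starts with a link count of 2 (its own "." entry plus its
# name in the parent) and gains one link per immediate subdirectory
# (each child's ".." entry).
def max_subdirs(link_max: int) -> int:
    """Most immediate subdirectories one directory can hold."""
    return link_max - 2

# The two UFS_LINK_MAX values mentioned above.
for link_max in (32767, 65530):
    print(f"UFS_LINK_MAX={link_max}: up to {max_subdirs(link_max)} subdirectories")
```

Under that assumption, a single directory that needs one
subdirectory per queued port runs out of links at 32767 once the
count passes 32765.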

Whether "options MAXMEMDOM=2" was present or not changed
the behavior of my UFS "bulk -a" runs, not just whether the
"kernel: Process (pid 1) got signal 5" messages were reported.
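
For anyone wanting to check their own /var/log/messages for the
same notices, here is a minimal sketch of a counter. It assumes
that a syslogd "last message repeated N times" line refers to the
line logged immediately before it. The function name count_message
is hypothetical, and the sample lines are copied from the quoted
log excerpt.

```python
import re

# syslogd collapses consecutive identical messages into a line like this.
REPEAT_RE = re.compile(r"last message repeated (\d+) times")

def count_message(lines, needle):
    """Count occurrences of needle, expanding syslogd repeat lines.

    Assumption: a repeat line refers to the message logged just before it.
    """
    total = 0
    prev_matched = False
    for line in lines:
        repeat = REPEAT_RE.search(line)
        if repeat:
            if prev_matched:
                total += int(repeat.group(1))
            continue
        prev_matched = needle in line
        if prev_matched:
            total += 1
    return total

# Sample lines copied from the quoted /var/log/messages excerpt.
SAMPLE = [
    "Nov 20 23:14:21 7950X3D-UFS syslogd: kernel boot file is /boot/kernel/kernel",
    "Nov 20 23:14:21 7950X3D-UFS kernel: got signal 5",
    "Nov 20 23:14:21 7950X3D-UFS kernel: Process (pid 1) got signal 5",
    "Nov 20 23:14:21 7950X3D-UFS syslogd: last message repeated 3133 times",
    "Nov 20 23:14:21 7950X3D-UFS kernel: intsmb0: <AMD FCH SMBus Controller> at device 20.0 on pci0",
]

print(count_message(SAMPLE, "Process (pid 1) got signal 5"))  # prints 3134
```

For a real log, the lines could be read with
open("/var/log/messages", errors="replace") instead of SAMPLE.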

With MAXMEMDOM=2 I was getting occasional random port
build failures with the non-debug kernel in a UFS context.
(The original discovery was in a non-debug kernel/world
context.) I never got as far as 2 hours into a "bulk -a"
without getting such a random port build failure. By
contrast, my normal non-debug kernel ZFS context did not get
such random port build errors during "bulk -a". (I've done
multiple from-scratch "bulk -a" builds from the same,
non-updated /usr/ports tree [same by content].)

Without MAXMEMDOM=2 I am not getting the random port build
failures in any context that I've tried. So far as I can
tell, this "Without MAXMEMDOM=2" type of context is simply
working correctly. A from-scratch "bulk -a" completed with a
non-debug kernel on UFS, with no evidence of any problems:
the port build failures it reported are ones that also
happen in other contexts. For reference:

[main-amd64-bulk_a-default] [2023-11-21_22h14m34s] [committing:] Queued: 34683 Built: 33843 Failed: 163   Skipped: 357   Ignored: 320   Fetched: 0     Tobuild: 0      Time: 33:17:31
[33:17:32] Logs: /usr/local/poudriere/data/logs/bulk/main-amd64-bulk_a-default/2023-11-21_22h14m34s
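
As a quick sanity check on that summary line, the sketch below
pulls out the counters and confirms that they account for every
queued port. It assumes the Built, Failed, Skipped, Ignored,
Fetched, and Tobuild counters partition the queue, which holds for
the figures above. The name parse_summary is hypothetical and is
not a poudriere interface.

```python
import re

# Summary line copied from the poudriere output above.
SUMMARY = (
    "[main-amd64-bulk_a-default] [2023-11-21_22h14m34s] [committing:] "
    "Queued: 34683 Built: 33843 Failed: 163   Skipped: 357   "
    "Ignored: 320   Fetched: 0     Tobuild: 0      Time: 33:17:31"
)

FIELDS = ("Queued", "Built", "Failed", "Skipped", "Ignored", "Fetched", "Tobuild")
# Match only the named counters, so "Time: 33:17:31" is not picked up.
FIELD_RE = re.compile(r"\b(" + "|".join(FIELDS) + r"):\s+(\d+)")

def parse_summary(line):
    """Return the named counters of a summary line as a dict of ints."""
    return {name: int(value) for name, value in FIELD_RE.findall(line)}

counts = parse_summary(SUMMARY)
# Assumption: every queued port ends up in exactly one other counter.
accounted = sum(counts[name] for name in FIELDS if name != "Queued")

print("queued:", counts["Queued"], "accounted for:", accounted)
print(f"failed share: {100.0 * counts['Failed'] / counts['Queued']:.2f}%")
```

Here the six outcome counters sum to 34683, matching Queued.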


===
Mark Millard
marklmi at yahoo.com