FYI: ^T use during poudriere bulk vs. /bin/sh operation: I got an "Unsafe ckmalloc() call" Abort trap that left a mess

From: Mark Millard <marklmi_at_yahoo.com>
Date: Tue, 05 Sep 2023 00:16:56 UTC

During a (ZFS-based) poudriere bulk -a run, a ^T produced:

Unsafe ckmalloc() call
Abort trap (core dumped)

from /bin/sh. At that point it had
built something like 350 ports into
packages in .building/. After the
last finish was reported for the
ports that were already building, I
killed 2 stuck poudriere-related
processes (the rest were already
gone) and then used:

# poudriere jail -jmain-amd64-bulk_a -k
[00:06:03] Unmounting file systems
umount: unmount of /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/usr/tests failed: Device busy
umount: unmount of /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/usr/share failed: Device busy
umount: unmount of /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/usr/lib32 failed: Device busy
umount: unmount of /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/rescue failed: Device busy
umount: unmount of /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/usr/tests failed: Device busy
umount: unmount of /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/usr/share failed: Device busy
umount: unmount of /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/usr/lib32 failed: Device busy
umount: unmount of /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/rescue failed: Device busy
chflags: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/rescue/rcorder: Read-only file system
chflags: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/rescue/mdconfig: Read-only file system
chflags: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/rescue/mdmfs: Read-only file system
. . .
chflags: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/usr/tests/lib: Read-only file system
chflags: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/usr/tests: Read-only file system
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/rescue: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/usr/lib32: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/usr/share: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/usr/tests: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/usr: Directory not empty
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref: Directory not empty
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/dev: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/var/db/ports: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/var/db: Directory not empty
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/var/empty: Operation not permitted
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/var: Directory not empty
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/rescue: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/sbin/init: Operation not permitted
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/sbin: Directory not empty
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/libexec/ld-elf.so.1: Operation not permitted
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/libexec/ld-elf32.so.1: Operation not permitted
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/libexec: Directory not empty
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/packages: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/proc: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/distfiles: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/compat/linux/proc: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/compat/linux: Directory not empty
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/compat: Directory not empty
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/lib/libthr.so.3: Operation not permitted
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/lib/libc.so.7: Operation not permitted
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/lib/libcrypt.so.5: Operation not permitted
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/lib: Directory not empty
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/usr/src: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/usr/bin/crontab: Operation not permitted
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/usr/bin/passwd: Operation not permitted
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/usr/bin/chpass: Operation not permitted
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/usr/bin/su: Operation not permitted
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/usr/bin/login: Operation not permitted
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/usr/bin: Directory not empty
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/usr/tests: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/usr/ports: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/usr/lib/librt.so.1: Operation not permitted
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/usr/lib: Directory not empty
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/usr/lib32: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/usr/share: Device busy
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40/usr: Directory not empty
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../40: Directory not empty
. . .
rm: /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/ref/../: Invalid argument


After that, "df -m" shows many "Mounted on" entries matching the
patterns:

/usr/local/poudriere/data/.m/main-amd64-bulk_a-default/*/rescue
/usr/local/poudriere/data/.m/main-amd64-bulk_a-default/*/usr/lib32
/usr/local/poudriere/data/.m/main-amd64-bulk_a-default/*/usr/share
/usr/local/poudriere/data/.m/main-amd64-bulk_a-default/*/usr/tests
/usr/local/poudriere/data/.m/main-amd64-bulk_a-default/*/var/db/ports
/usr/local/poudriere/data/.m/main-amd64-bulk_a-default/*/packages
/usr/local/poudriere/data/.m/main-amd64-bulk_a-default/*/usr/src
/usr/local/poudriere/data/.m/main-amd64-bulk_a-default/*/usr/ports
/usr/local/poudriere/data/.m/main-amd64-bulk_a-default/*/distfiles
/usr/local/poudriere/data/.m/main-amd64-bulk_a-default/*/dev
/usr/local/poudriere/data/.m/main-amd64-bulk_a-default/*/dev/fd
/usr/local/poudriere/data/.m/main-amd64-bulk_a-default/*/compat/linux/proc
/usr/local/poudriere/data/.m/main-amd64-bulk_a-default/*/proc
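
A minimal sketch, not part of the original report, of how leftover
mounts like the ones above could be listed deepest-first, so that a
nested mount such as */dev/fd comes before its parent */dev. The
function name leftover_mounts is hypothetical, and the sketch assumes
mount point paths contain no whitespace or newlines. It only prints
paths and unmounts nothing itself.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not a poudriere command).
# Reads mount point paths on stdin, one per line, keeps those that
# start with the prefix given as $1, and prints them in reverse
# byte order. A child path always sorts after its parent in
# ascending order, so reversing puts children first.
leftover_mounts() {
    awk -v p="$1" 'index($0, p) == 1' | LC_ALL=C sort -r
}

# Possible use on FreeBSD (assumption: "mount -p" prints fstab-style
# lines with the mount point in the second field):
#   mount -p | awk '{ print $2 }' |
#       leftover_mounts /usr/local/poudriere/data/.m/main-amd64-bulk_a-default/
```

After reviewing the printed list, it could be piped to "xargs -n 1 umount".
Mounts that still report "Device busy" presumably have processes
holding them open.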

No clue if I'll ever reproduce it.

The odd part of the context is that I was testing bulk -J128
(no ALLOW_MAKE_JOBS use) on a Threadripper 1950X, which has 32
hardware threads. This was part of an attempt to see whether I
would get any of the deadlocks, corruptions, or similar problems
that some earlier imports of OpenZFS had produced for others.

For reference:

# uname -apKU
FreeBSD amd64-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT amd64 1500000 #118 main-n265152-f49d6f583e9d-dirty: Mon Sep  4 14:26:56 PDT 2023     root@amd64_ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG amd64 amd64 1500000 1500000

My attribution to ^T handling is unverified: I did not find the
sh.core file. It is just what the timing looked like.
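
A minimal sketch, not part of the original report, of one way to look
for such a core file. It assumes the FreeBSD default kern.corefile
pattern of %N.core, under which the file is written to the crashing
process's working directory. The function name find_cores is
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (an illustration only).
# Lists regular files named <prog>.core anywhere under directory $1.
# Assumes the core file name follows the %N.core pattern; check the
# actual setting with "sysctl kern.corefile" first.
find_cores() {
    dir=$1
    prog=$2
    find "$dir" -type f -name "${prog}.core" 2>/dev/null
}

# Possible use:
#   find_cores /usr/local/poudriere sh
```

If the working directory was inside a builder's file system that was
later torn down, the core file may no longer exist, so an empty result
does not rule out that one was written.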


===
Mark Millard
marklmi at yahoo.com