[Bug 280038] -Stable 14.1 on ARM compiler failure not seen in 14.1-RELEASE Pi3

From: <bugzilla-noreply_at_freebsd.org>
Date: Fri, 28 Jun 2024 16:59:01 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280038

            Bug ID: 280038
           Summary: -Stable 14.1 on ARM compiler failure not seen in
                    14.1-RELEASE Pi3
           Product: Base System
           Version: 14.1-STABLE
          Hardware: arm64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: arm
          Assignee: freebsd-arm@FreeBSD.org
          Reporter: karl@denninger.net

This is a rather odd problem and I'm uncertain of the scope.

Context: Pi3; I checked multiple physical devices, including one of the "newest"
ones with connections for PoE HATs as well as an older unit, with identical
results.  On a Pi4 4GB, booting the SAME SD card, there is no problem.

The package that fails is code I've had running for quite some time, and on
13.x-STABLE it works perfectly well.  I build using Crochet; -RELEASE, however,
was checked both with my own worktree for releng/14.1 and stable/14.

The exact place in the source (which function is being compiled at the time)
where the compiler blows up varies to some degree, but the crash is the same in
all instances.  Whether the source is on a UFS+SU/J filesystem on the SD card
or copied to a tmpfs (RAM disk) doesn't matter, so I surmise this is something
that has changed either in the kernel or in clang -- and it may be thread
related.

That it never occurs on the Pi4 is troublesome, as that implies it's local to
either the CPU on the "3" or its RAM architecture vs. the 4, given that I am
literally plugging the same SD card into each.  There is no evidence of RAM
exhaustion or similar.

Here's an example of the crash; I am running this on the physical (serial)
console so if there was a kernel complaint about memory or similar it would be
embedded in this output.

First, the boot message from "dmesg" on the Pi3 (some elided but including the
ARM CPU info):
---<<BOOT>>---
WARNING: Cannot find freebsd,dts-version property, cannot check DTB compliance
Copyright (c) 1992-2023 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 14.1-STABLE stable/14-n268036-9a53391b601d GENERIC arm64
FreeBSD clang version 18.1.6 (https://github.com/llvm/llvm-project.git
llvmorg-18.1.6-0-g1118c2e05e67)
VT(efifb): resolution 656x416
module scmi already present!
real memory  = 994041856 (947 MB)
avail memory = 945451008 (901 MB)
Starting CPU 1 (1)
Starting CPU 2 (2)
Starting CPU 3 (3)
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
arc4random: WARNING: initial seeding bypassed the cryptographic random device
because it was not yet seeded and the knob 'bypass_before_seeding' was enabled.
random: entropy device external interface
kbd0 at kbdmux0
ofwbus0: <Open Firmware Device Tree>
simplebus0: <Flattened device tree simple bus> on ofwbus0
ofw_clkbus0: <OFW clocks bus> on ofwbus0
regfix0: <Fixed Regulator> on ofwbus0
clk_fixed2: clock-fixed has no clock-frequency
....
CPU  0: ARM Cortex-A53 r0p4 affinity:  0
                   Cache Type = <64 byte D-cacheline,64 byte I-cacheline,VIPT
ICache,64 byte ERG,64 byte CWG>
 Instruction Set Attributes 0 = <CRC32>
 Instruction Set Attributes 1 = <>
 Instruction Set Attributes 2 = <>
         Processor Features 0 = <AdvSIMD,FP,EL3 32,EL2 32,EL1 32,EL0 32>
         Processor Features 1 = <>
      Memory Model Features 0 = <TGran4,TGran64,SNSMem,BigEnd,16bit ASID,1TB
PA>
Trying to mount root from ufs:/dev/mmcsd0s2a [ro]...
      Memory Model Features 1 = <8bit VMID>
      Memory Model Features 2 = <32bit CCIDX,48bit VA>
             Debug Features 0 = <DoubleLock,2 CTX BKPTs,4 Watchpoints,6
Breakpoints,PMUv3,Debugv8>
             Debug Features 1 = <>
         Auxiliary Features 0 = <>
         Auxiliary Features 1 = <>
AArch32 Instruction Set Attributes 5 = <CRC32,SEVL>
AArch32 Media and VFP Features 0 = <FPRound,FPSqrt,FPDivide,DP VFPv3+v4,SP
VFPv3+v4,AdvSIMD>
AArch32 Media and VFP Features 1 = <SIMDFMAC,FPHP DP Conv,SIMDHP SP
Conv,SIMDSP,SIMDInt,SIMDLS,FPDNaN,FPFtZ>
CPU  1: ARM Cortex-A53 r0p4 affinity:  1
CPU  2: ARM Cortex-A53 r0p4 affinity:  2
CPU  3: ARM Cortex-A53 r0p4 affinity:  3
Release APs...done

And then....

root@rpi:/data/karl/HD-MCP # make clean
rm -f *.o hd-mcp hd-mcp.freeware license-server hd-commit
root@rpi:/data/karl/HD-MCP # make
cc  -g -Wstrict-prototypes -DVERSION=\"8.0.0-LocalAuth\" -c config.c -o
config.o
cc  -g -Wstrict-prototypes -DVERSION=\"8.0.0-LocalAuth\" -c funcs.c -o funcs.o
cc  -g -Wstrict-prototypes -DVERSION=\"8.0.0-LocalAuth\" -c hd-mcp.c -o
hd-mcp.o
PLEASE submit a bug report to https://bugs.freebsd.org/submit/ and include the
crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: cc -g -Wstrict-prototypes
-DVERSION=\"8.0.0-LocalAuth\" -c hd-mcp.c -o hd-mcp.o
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'hd-mcp.c'.
4.      Running pass 'AArch64O0PreLegalizerCombiner' on function
'@process_unit_get_response'
#0 0x0000000004b17588 (/usr/bin/cc+0x4b17588)
#1 0x0000000004b15650 (/usr/bin/cc+0x4b15650)
#2 0x0000000004ae16a0 (/usr/bin/cc+0x4ae16a0)
#3 0x000000008a02eeb8 (/lib/libthr.so.3+0x2aeb8)
cc: error: clang frontend command failed with exit code 139 (use -v to see
invocation)
FreeBSD clang version 18.1.6 (https://github.com/llvm/llvm-project.git
llvmorg-18.1.6-0-g1118c2e05e67)
Target: aarch64-unknown-freebsd14.1
Thread model: posix
InstalledDir: /usr/bin
cc: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
cc: note: diagnostic msg: /tmp/hd-mcp-720fdc.c
cc: note: diagnostic msg: /tmp/hd-mcp-720fdc.sh
cc: note: diagnostic msg:

********************
*** Error code 1

Stop.
make: stopped in /data/karl/HD-MCP
root@rpi:/data/karl/HD-MCP #

The crash is always in libthr.so.3 and at that same offset (+0x2aeb8);
SOMETIMES if I re-execute the "make" command it will get past this file but
then blows up in another one:

root@rpi:/data/karl/HD-MCP # make
cc  -g -Wstrict-prototypes -DVERSION=\"8.0.0-LocalAuth\" -c hd-mcp.c -o
hd-mcp.o
cc  -g -Wstrict-prototypes -DVERSION=\"8.0.0-LocalAuth\" -c www.c -o www.o
cc  -g -Wstrict-prototypes -DVERSION=\"8.0.0-LocalAuth\" -c slave.c -o slave.o
cc  -g -Wstrict-prototypes -DVERSION=\"8.0.0-LocalAuth\" -c amcrest.c -o
amcrest.o
cc  -g -Wstrict-prototypes -DVERSION=\"8.0.0-LocalAuth\" -c license.c -o
license.o
cc  -g -Wstrict-prototypes -DVERSION=\"8.0.0-LocalAuth\" -c z-wave.c -o
z-wave.o
cc  -g -Wstrict-prototypes -DVERSION=\"8.0.0-LocalAuth\" -c malloc.c -o
malloc.o
cc  -g -Wstrict-prototypes -DVERSION=\"8.0.0-LocalAuth\" -c S0-encryption.c -o
S0-encryption.o
PLEASE submit a bug report to https://bugs.freebsd.org/submit/ and include the
crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: cc -g -Wstrict-prototypes
-DVERSION=\"8.0.0-LocalAuth\" -c S0-encryption.c -o S0-encryption.o
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'S0-encryption.c'.
4.      Running pass 'AArch64O0PreLegalizerCombiner' on function
'@generate_mac'
#0 0x0000000004b17588 (/usr/bin/cc+0x4b17588)
#1 0x0000000004b15650 (/usr/bin/cc+0x4b15650)
#2 0x0000000004ae16a0 (/usr/bin/cc+0x4ae16a0)
#3 0x000000008a39feb8 (/lib/libthr.so.3+0x2aeb8)
cc: error: clang frontend command failed with exit code 139 (use -v to see
invocation)
FreeBSD clang version 18.1.6 (https://github.com/llvm/llvm-project.git
llvmorg-18.1.6-0-g1118c2e05e67)
Target: aarch64-unknown-freebsd14.1
Thread model: posix
InstalledDir: /usr/bin
cc: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
cc: note: diagnostic msg: /tmp/S0-encryption-2a4545.c
cc: note: diagnostic msg: /tmp/S0-encryption-2a4545.sh
cc: note: diagnostic msg:

********************
*** Error code 1

Stop.
make: stopped in /data/karl/HD-MCP

That file almost NEVER completes -- but once in a great while it will (!!) and
when it does the executable that gets produced runs as expected.
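
To put a number on "almost never", the compile can be repeated in a loop and
the failures counted.  This is only a minimal sketch: the helper name
count_failures is hypothetical (it is not part of the build), and it treats any
non-zero exit status from the command as a failure rather than looking
specifically for the segfault.

```shell
#!/bin/sh
# count_failures N CMD...: run CMD N times and print how many runs
# exited non-zero.  Hypothetical helper, for illustration only.
count_failures() {
    runs=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard compiler output; only the exit status matters here.
        "$@" >/dev/null 2>&1 || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed/$runs failed"
}
```

On the Pi3, in the HD-MCP source directory, it would be invoked with the same
flags shown in the make output, e.g.
count_failures 20 cc -g -Wstrict-prototypes -DVERSION=\"8.0.0-LocalAuth\" -c S0-encryption.c -o S0-encryption.o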

The crash in the compiler, when it occurs, always has the same traceback to the
same place in libthr.so.3, irrespective of which function is referenced in the
compiler crash itself.
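
As a quick check of "the same place", frame #3 of the two backtraces above can
be compared by subtracting the printed libthr.so.3 offset from the absolute
address.  This is a small sketch using only the values from the two stack dumps
in this report; the helper name base_of is made up for illustration.

```shell
#!/bin/sh
# base_of ADDR OFFSET: print the implied load base of the library,
# i.e. the absolute frame address minus the offset within the library.
base_of() {
    printf '0x%x\n' $(( $1 - $2 ))
}

base_of 0x8a02eeb8 0x2aeb8   # frame #3, hd-mcp.c crash         -> 0x8a004000
base_of 0x8a39feb8 0x2aeb8   # frame #3, S0-encryption.c crash  -> 0x8a375000
```

The implied load base of libthr.so.3 differs between the two runs, so the
absolute addresses differ, but the offset within the library (0x2aeb8) is
identical in both.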

Clearing the object directory and re-running the build does not change the
outcome.  But if I build releng/14.1 (same Crochet, just changing the source to
releng/14.1 from stable/14)  and boot THAT, it never crashes.  It also never
crashes during build on *either* version of the OS if I am running on a Pi4 --
only on the 3.

S0-encryption.c is an unremarkable file containing a handful of functions, all
related to using OpenSSL routines for encryption and decryption and for
computing (and checking) a MAC against data packets; it is only some 400 lines
of C code.

If the crash disappears on future updates to stable/14 I'll withdraw it as OBE,
but since this implies there's a potential problem with thread handling on the
Pi3 under stable/14 I wanted to put it out there.
