Re: Still seeing Failed assertion: "p[i] == 0" on armv7 buildworld

From: Mark Millard <marklmi_at_yahoo.com>
Date: Fri, 14 Nov 2025 17:16:57 UTC
On Nov 14, 2025, at 07:25, bob prohaska <fbsd@www.zefox.net> wrote:

> On Thu, Nov 13, 2025 at 11:16:56PM -0800, Carl Shapiro wrote:
>> bob prohaska <fbsd@www.zefox.net> writes:
>> 
>>> All the assertion failures I've seen have been in the clang libraries during
>>> buildworld. They appear to happen in a variety of cases, indicated by the 
>>> different .sh and .cpp filenames found in the files under
>>> http://www.zefox.net/~fbsd/assertion_failure/
>> 
>> Do you have the stdout and stderr of the build somewhere in there as
>> well?  The make(1) invocation in the readme file shows its output being
>> redirected to a file.
> 
> Those files have been overwritten by restarting the buildworld sessions.
> They tend to be large and difficult to synchronize with the .cpp and .sh
> files generated by the crash. It could be done if it's useful.
> 
>> 
>> The assert you mentioned in the subject of your e-mail message, which I
>> also saw in the readme file, could come from jemalloc.  See these lines
>> of code for the context
>> 
>> https://github.com/facebook/jemalloc/blob/dev/src/extent.c#L805-L814
>> 
>> That assertion will be tripped when jemalloc sees non-zero memory that
>> it expects to be zeroed.  See for example
>> 
>> https://github.com/facebook/jemalloc/blob/dev/src/pages.c#L55-L106
>> 
>> Looking at the code, my hypothesis would be that jemalloc thinks it's
>> committing memory for the first time but the memory is coming back with
>> non-zero data.
>> 
>> Just curious, but is over-commit enabled on your system?  Here is the
>> signal jemalloc is using to check
>> 
>> https://github.com/facebook/jemalloc/blob/dev/src/pages.c#L729-L737
>> 
> 
> Sysctl -a reports in part:
> # sysctl -a | grep -i overcommit
> sysctl: S_vmtotal 48 != 88

The S_vmtotal line above appears at the point where "sysctl -a"
would normally print what

sysctl vm.vmtotal

reports, that is, the output for

"System wide totals computed every five seconds".

That S_vmtotal line is an internal warning from
sysctl. The 88 is correct and is sizeof(struct vmtotal)
from sys/sys/vmmeter.h :

(kgdb) ptype /o *(struct vmtotal*)0
/* offset      |    size */  type = struct vmtotal {
/*      0      |       8 */    uint64_t t_vm;
/*      8      |       8 */    uint64_t t_avm;
/*     16      |       8 */    uint64_t t_rm;
/*     24      |       8 */    uint64_t t_arm;
/*     32      |       8 */    uint64_t t_vmshr;
/*     40      |       8 */    uint64_t t_avmshr;
/*     48      |       8 */    uint64_t t_rmshr;
/*     56      |       8 */    uint64_t t_armshr;
/*     64      |       8 */    uint64_t t_free;
/*     72      |       2 */    int16_t t_rq;
/*     74      |       2 */    int16_t t_dw;
/*     76      |       2 */    int16_t t_pw;
/*     78      |       2 */    int16_t t_sl;
/*     80      |       2 */    int16_t t_sw;
/*     82      |       6 */    uint16_t t_pad[3];

                               /* total size (bytes):   88 */
                             }

The 48 is the length that the internal sysctl(. . .)
call returned, and it is wrong. The message also
indicates that the normal associated output was not
generated for vm.vmtotal .
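
As a cross-check of the 88, here is a minimal sketch that rebuilds the
layout from the kgdb output above with Python's ctypes and compares a
returned length against it. The names VmTotal and size_matches are
hypothetical, used only for illustration; this is not the sysctl source,
just the same kind of length comparison that the warning implies.

```python
import ctypes

# Field names, types and order copied from the kgdb "ptype /o" output above.
# VmTotal is a hypothetical stand-in for struct vmtotal.
class VmTotal(ctypes.Structure):
    _fields_ = (
        [(name, ctypes.c_uint64) for name in (
            "t_vm", "t_avm", "t_rm", "t_arm", "t_vmshr",
            "t_avmshr", "t_rmshr", "t_armshr", "t_free")]
        + [(name, ctypes.c_int16) for name in (
            "t_rq", "t_dw", "t_pw", "t_sl", "t_sw")]
        + [("t_pad", ctypes.c_uint16 * 3)]
    )

def size_matches(returned_len):
    """Return True when a returned length equals sizeof(struct vmtotal)."""
    return returned_len == ctypes.sizeof(VmTotal)

# 9 * 8 bytes + 5 * 2 bytes + 3 * 2 bytes of padding
print(ctypes.sizeof(VmTotal))
print(size_matches(48), size_matches(88))
```

The first print shows 88, matching the kgdb total, and only the 88
passes the comparison.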

I do not know if the error is somehow associated with
your overlarge swap space (if you still have that).
In my context "sysctl vm.vmtotal" and "sysctl -a"
are working normally.


> vm.overcommit: 0

"man 7 tuning" reports about vm.overcommit :

     The vm.overcommit sysctl defines the overcommit behaviour of the vm
     subsystem.  The virtual memory system always does accounting of the swap
     space reservation, both total for system and per-user.  Corresponding
     values are available through sysctl vm.swap_total, that gives the total
     bytes available for swapping, and vm.swap_reserved, that gives number of
     bytes that may be needed to back all currently allocated anonymous
     memory.

     Setting bit 0 of the vm.overcommit sysctl causes the virtual memory
     system to return failure to the process when allocation of memory causes
     vm.swap_reserved to exceed vm.swap_total.  Bit 1 of the sysctl enforces
     RLIMIT_SWAP limit (see getrlimit(2)).  Root is exempt from this limit.
     Bit 2 allows to count most of the physical memory as allocatable, except
     wired and free reserved pages (accounted by vm.stats.vm.v_free_target and
     vm.stats.vm.v_wire_count sysctls, respectively).
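
Reading your vm.overcommit value of 0 against that text: none of the
three bits is set, so none of the described limits is active. A minimal
sketch of that decoding follows. The table and the function name
decode_overcommit are hypothetical, and the descriptions are my
paraphrase of the man page text above, not kernel code.

```python
# Bit meanings paraphrased from the tuning(7) text quoted above.
# OVERCOMMIT_BITS and decode_overcommit are illustrative names only.
OVERCOMMIT_BITS = {
    0: "fail allocations that push vm.swap_reserved past vm.swap_total",
    1: "enforce the RLIMIT_SWAP limit (root is exempt)",
    2: "count most physical memory as allocatable",
}

def decode_overcommit(value):
    """List the descriptions of the bits that are set in vm.overcommit."""
    return [desc for bit, desc in sorted(OVERCOMMIT_BITS.items())
            if value & (1 << bit)]

for value in (0, 1, 3, 4):
    print(value, decode_overcommit(value))
```

For the value 0 the list is empty, which is the case reported here.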

> # 
> It's unclear if this implies yes or no, or even is the correct test.
> 
>>> The failures are random in the sense that restarting buildworld either
>>> produces a new assertion failure in a different library or completion.
>>> 
>>> It isn't obvious how to capture a stack trace, if you can provide guidance
>>> I'll give it a try. As is, buildworld simply stops, the machine does not
>>> crash.
>> 
>> It might be captured for you already?  I noticed files with names
>> containing "symbolizer-input" and "symbolizer-output" like this one
>> 
>> http://www.zefox.net/~fbsd/assertion_failure/hostname_pelorus.zefox.org/symbolizer-output-7282d9
>> 
>> and the output files contain a stack trace like this
>> 
>>  llvm::sys::PrintStackTrace(llvm::raw_ostream&, int)
>>  /usr/src/contrib/llvm-project/llvm/lib/Support/Unix/Signals.inc:731:7
>> 
>>  llvm::sys::RunSignalHandlers()
>>  /usr/src/contrib/llvm-project/llvm/lib/Support/Signals.cpp:0:5
>> 
>>  SignalHandler
>>  /usr/src/contrib/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
>> 
>>  handle_signal
>>  /usr/src/lib/libthr/thread/thr_sig.c:0:3
>> 
>> Any idea who or what is creating those files and when?
> 
> The files are deposited in /tmp, apparently by the C compiler as records
> of an internal error in the compiler, usually number 134. My understanding 
> is superficial at best.  



===
Mark Millard
marklmi at yahoo.com