jemalloc 5.3.0 upgrade

From: Warner Losh <imp_at_bsdimp.com>
Date: Fri, 15 Aug 2025 21:56:05 UTC

After much delay, I've landed jemalloc 5.3.0 into main.

This is likely the last update of jemalloc since the upstream is, at best,
in turmoil, and at worst dead.

I tried to completely automate all the details of the upgrade, but only got
so far, and did the rest by hand (described in FREEBSD-upgrade). I'd held
off landing this until the automation was complete, but once it was clear
this was likely the last time we'd need it, I just did the last few steps
by hand. I did it this way to make the result easier to audit and to ensure
that the pull request we got for this (which I redid, then compared to the
original) didn't sneak something in. Others can audit me as well.

I've run this with a Netflix workload and my developer workload with no
regressions.

Please let me know if this causes problems for anybody. I'm sure glad I'll
not have to rebase the merge again (it was a pathological case for the
instructions in the handbook, so I'll update those).

I've been coordinating this with the release engineer for a while now, who
gave me the go-ahead to land this during the freeze since I couldn't
finish before my vacation last month...

Warner

P.S. Here are the release notes:
+* 5.3.0 (May 6, 2022)
+
+  This release contains many speed and space optimizations, from micro
+  optimizations on common paths to rework of internal data structures and
+  locking schemes, and many more too detailed to list below.  Multiple percent
+  of system level metric improvements were measured in tested production
+  workloads.  The release has gone through large-scale production testing.
+
+  New features:
+  - Add the thread.idle mallctl which hints that the calling thread will be
+    idle for a nontrivial period of time.  (@davidtgoldblatt)
+  - Allow small size classes to be the maximum size class to cache in the
+    thread-specific cache, through the opt.[lg_]tcache_max option.  (@interwq,
+    @jordalgo)
+  - Make the behavior of realloc(ptr, 0) configurable with opt.zero_realloc.
+    (@davidtgoldblatt)
+  - Add 'make uninstall' support.  (@sangshuduo, @Lapenkov)
+  - Support C++17 over-aligned allocation.  (@marksantaniello)
+  - Add the thread.peak mallctl for approximate per-thread peak memory tracking.
+    (@davidtgoldblatt)
+  - Add interval-based stats output opt.stats_interval.  (@interwq)
+  - Add prof.prefix to override filename prefixes for dumps.  (@zhxchen17)
+  - Add high resolution timestamp support for profiling.  (@tyroguru)
+  - Add the --collapsed flag to jeprof for flamegraph generation.
+    (@igorwwwwwwwwwwwwwwwwwwww)
+  - Add the --debug-syms-by-id option to jeprof for debug symbols discovery.
+    (@DeannaGelbart)
+  - Add the opt.prof_leak_error option to exit with error code when leak is
+    detected using opt.prof_final.  (@yunxuo)
+  - Add opt.cache_oblivious as a runtime alternative to config.cache_oblivious.
+    (@interwq)
+  - Add mallctl interfaces:
+    + opt.zero_realloc  (@davidtgoldblatt)
+    + opt.cache_oblivious  (@interwq)
+    + opt.prof_leak_error  (@yunxuo)
+    + opt.stats_interval  (@interwq)
+    + opt.stats_interval_opts  (@interwq)
+    + opt.tcache_max  (@interwq)
+    + opt.trust_madvise  (@azat)
+    + prof.prefix  (@zhxchen17)
+    + stats.zero_reallocs  (@davidtgoldblatt)
+    + thread.idle  (@davidtgoldblatt)
+    + thread.peak.{read,reset}  (@davidtgoldblatt)
+
+  Bug fixes:
+  - Fix the synchronization around explicit tcache creation which could cause
+    invalid tcache identifiers.  This regression was first released in 5.0.0.
+    (@yoshinorim, @davidtgoldblatt)
+  - Fix a profiling biasing issue which could cause incorrect heap usage and
+    object counts.  This issue existed in all previous releases with the heap
+    profiling feature.  (@davidtgoldblatt)
+  - Fix the order of stats counter updating on large realloc which could cause
+    failed assertions.  This regression was first released in 5.0.0.  (@azat)
+  - Fix the locking on the arena destroy mallctl, which could cause concurrent
+    arena creations to fail.  This functionality was first introduced in 5.0.0.
+    (@interwq)
+
+  Portability improvements:
+  - Remove nothrow from system function declarations on macOS and FreeBSD.
+    (@davidtgoldblatt, @fredemmott, @leres)
+  - Improve overcommit and page alignment settings on NetBSD.  (@zoulasc)
+  - Improve CPU affinity support on BSD platforms.  (@devnexen)
+  - Improve utrace detection and support.  (@devnexen)
+  - Improve QEMU support with MADV_DONTNEED zeroed pages detection.  (@azat)
+  - Add memcntl support on Solaris / illumos.  (@devnexen)
+  - Improve CPU_SPINWAIT on ARM.  (@AWSjswinney)
+  - Improve TSD cleanup on FreeBSD.  (@Lapenkov)
+  - Disable percpu_arena if the CPU count cannot be reliably detected.  (@azat)
+  - Add malloc_size(3) override support.  (@devnexen)
+  - Add mmap VM_MAKE_TAG support.  (@devnexen)
+  - Add support for MADV_[NO]CORE.  (@devnexen)
+  - Add support for DragonFlyBSD.  (@devnexen)
+  - Fix the QUANTUM setting on MIPS64.  (@brooksdavis)
+  - Add the QUANTUM setting for ARC.  (@vineetgarc)
+  - Add the QUANTUM setting for LoongArch.  (@wangjl-uos)
+  - Add QNX support.  (@jqian-aurora)
+  - Avoid atexit(3) calls unless the relevant profiling features are enabled.
+    (@BusyJay, @laiwei-rice, @interwq)
+  - Fix unknown option detection when using Clang.  (@Lapenkov)
+  - Fix symbol conflict with musl libc.  (@georgthegreat)
+  - Add -Wimplicit-fallthrough checks.  (@nickdesaulniers)
+  - Add __forceinline support on MSVC.  (@santagada)
+  - Improve FreeBSD and Windows CI support.  (@Lapenkov)
+  - Add CI support for PPC64LE architecture.  (@ezeeyahoo)
+
+  Incompatible changes:
+  - Maximum size class allowed in tcache (opt.[lg_]tcache_max) now has an upper
+    bound of 8MiB.  (@interwq)
+
+  Optimizations and refactors (@davidtgoldblatt, @Lapenkov, @interwq):
+  - Optimize the common cases of the thread cache operations.
+  - Optimize internal data structures, including RB tree and pairing heap.
+  - Optimize the internal locking on extent management.
+  - Extract and refactor the internal page allocator and interface modules.
+
+  Documentation:
+  - Fix doc build with --with-install-suffix.  (@lawmurray, @interwq)
+  - Add PROFILING_INTERNALS.md.  (@davidtgoldblatt)
+  - Ensure the proper order of doc building and installation.  (@Mingli-Yu)
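
For anyone who wants a quick smoke test of the "C++17 over-aligned
allocation" item in the notes above, here is a minimal sketch. It uses only
standard C++17 and nothing jemalloc-specific. The CacheLine type, its 64-byte
alignment, and the helper names are made up for illustration. The assumption
is that on a system whose libc malloc is jemalloc, the aligned operator new
is ultimately served by the system allocator, so this exercises that path
only indirectly.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical type for illustration: 64 exceeds the default new alignment
// on common platforms, so C++17 selects the std::align_val_t overload of
// operator new when one of these is heap-allocated.
struct alignas(64) CacheLine {
    unsigned char bytes[64];
};

// Returns true if p is aligned to `alignment` bytes.
bool is_aligned(const void *p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Heap-allocates one CacheLine and reports whether the requested alignment
// was honored by the allocator behind operator new.
bool over_aligned_new_works() {
    auto line = std::make_unique<CacheLine>();
    return is_aligned(line.get(), alignof(CacheLine));
}
```

Build it with -std=c++17 or later. Older language modes do not guarantee
that plain new honors the extended alignment.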