Re: performance regressions in 15.0

From: Karl Denninger <karl_at_denninger.net>
Date: Sun, 07 Dec 2025 13:26:47 UTC

On 12/6/2025 22:03, Mark Millard wrote:
> On Dec 6, 2025, at 14:25, Warner Losh <imp@bsdimp.com> wrote:
>
>> On Sat, Dec 6, 2025, 3:06 PM Mark Millard <marklmi@yahoo.com> wrote:
>>
>>> On Dec 6, 2025, at 06:14, Mark Millard <marklmi@yahoo.com> wrote:
>>>
>>>> Mateusz Guzik <mjguzik_at_gmail.com> wrote on
>>>> Date: Sat, 06 Dec 2025 10:50:08 UTC :
>>>>
>>>>> I got pointed at phoronix: https://www.phoronix.com/review/freebsd-15-amd-epyc
>>>>>
>>>>> While I don't treat their results as gospel, a FreeBSD vs FreeBSD test
>>>>> showing a slowdown most definitely warrants a closer look.
>>>>>
>>>>> They observed slowdowns when using iperf over localhost and when compiling llvm.
>>>>>
>>>>> I can confirm both problems and more.
>>>>>
>>>>> I found the profiling tooling for userspace to be broken again so I
>>>>> did not investigate much and I'm not going to dig into it further.
>>>>>
>>>>> Test box is AMD EPYC 9454 48-Core Processor, with the 2 systems
>>>>> running as 8 core vms under kvm.
>>>>> . . .

A note on jemalloc-5.3.0.

I have a relatively-complex "persistent" (FastCGI) application that 
makes a large number of dynamic RAM allocations of various sizes from 
modest to quite-large (many megabytes) and then, at the end of each pass 
through for a particular connection, releases most of the 
possibly-larger ones.  It also, of course, has buffers I don't directly 
control (such as those in the database connection library for Postgres 
that is linked with it.)

It had a rather annoying habit of sometimes growing its allocated memory 
during certain access patterns that I was not able to trace conclusively. 
After a lot of debugging and profiling I was utterly convinced that 
I was not leaving anything dangling.  I even went so far as to wrap all 
malloc and free calls and keep a separate list of outstanding 
allocations, with each pass through the code validating that the list 
was in fact empty when the pass completed.  It appeared that the 
allocator was failing to reuse available-but-not-in-use space under 
certain conditions.
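
For illustration only, here is a minimal sketch of the kind of 
malloc/free wrapping described above.  It is not the code from the 
application in question; the names tracked_malloc, tracked_free, 
tracked_live_count, tracked_report_leaks and simulated_pass are all 
hypothetical, and it assumes a single-threaded caller.  The idea is 
that if the list of outstanding allocations is empty at the end of 
every pass and the process still grows, the growth is not an 
application-level leak.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* One record per live allocation made through the wrappers. */
struct live_alloc {
    void *ptr;
    size_t size;
    struct live_alloc *next;
};

static struct live_alloc *live_head = NULL;
static size_t live_count = 0;

/* malloc() wrapper: allocate, then record the block in the live list. */
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;

    struct live_alloc *rec = malloc(sizeof(*rec));
    if (rec == NULL) {
        free(p);
        return NULL;
    }
    rec->ptr = p;
    rec->size = size;
    rec->next = live_head;
    live_head = rec;
    live_count++;
    return p;
}

/*
 * free() wrapper: remove the record, then free the block.
 * Returns 0 on success, -1 if the pointer was never recorded
 * (in which case nothing is freed).
 */
int tracked_free(void *p)
{
    if (p == NULL)
        return 0;

    for (struct live_alloc **pp = &live_head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->ptr == p) {
            struct live_alloc *rec = *pp;
            *pp = rec->next;
            free(rec);
            free(p);
            assert(live_count > 0);
            live_count--;
            return 0;
        }
    }
    return -1;
}

/* Number of allocations currently outstanding. */
size_t tracked_live_count(void)
{
    return live_count;
}

/* Print every outstanding allocation; returns how many there are. */
size_t tracked_report_leaks(FILE *out)
{
    for (struct live_alloc *rec = live_head; rec != NULL; rec = rec->next)
        fprintf(out, "still live: %zu bytes at %p\n", rec->size, rec->ptr);
    return live_count;
}

/*
 * Stand-in for one pass through the application: allocate a small and
 * a multi-megabyte buffer, release both, then check the list.
 */
size_t simulated_pass(void)
{
    void *small = tracked_malloc(64);
    void *large = tracked_malloc((size_t)4 * 1024 * 1024);

    tracked_free(large);
    tracked_free(small);
    return tracked_report_leaks(stderr);
}
```

A check like this only shows that the application has handed 
everything back.  It says nothing about what the allocator then does 
with the freed space, which is where the behavior described here 
appears to come from.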

I finally grabbed and compiled up jemalloc-5.3.0 and linked with that 
instead of the stock library /and the behavior disappeared./  That code 
now runs in a very stable, predictable and consistent RSS/VSZ range with 
no changes to the actual code itself whatsoever -- only the change in 
the jemalloc version is involved.  I've now got many months of stable 
operation post making this change.

I haven't noted performance problems of any sort with 5.3.0.  What is 
very clear is that this version appears to organize the arena better: 
I no longer get effectively-unbounded growth in it over long periods of 
execution, growth that occurred before even though the code was not 
leaking malloc()ed memory that was never free()d.  This specific code 
is running on 14.3-RELEASE-p5 at the present time.

-- 
Karl Denninger
karl@denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/