From: Mateusz Guzik
Date: Sat, 6 Dec 2025 11:50:08 +0100
Subject: performance regressions in 15.0
To: FreeBSD Current
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

I got pointed at phoronix: https://www.phoronix.com/review/freebsd-15-amd-epyc

While I don't treat their results as gospel, a FreeBSD vs FreeBSD test
showing a slowdown most definitely warrants a closer look.

They observed slowdowns when using iperf over localhost and when
compiling llvm.

I can confirm both problems and more.

I found the profiling tooling for userspace to be broken again so I
did not investigate much and I'm not going to dig into it further.

Test box is AMD EPYC 9454 48-Core Processor, with the 2 systems
running as 8 core vms under kvm.

I. iperf

Package is: iperf3-3.19.1

Tested with: iperf3 -s + iperf3 -c localhost

While the rates fluctuate, 14.3 is overall faster:

[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.01   sec  2.70 GBytes  23.1 Gbits/sec
[  5]   1.01-2.07   sec  1.92 GBytes  15.5 Gbits/sec
[  5]   2.07-3.01   sec  1.76 GBytes  16.1 Gbits/sec
[  5]   3.01-4.02   sec  1.86 GBytes  15.9 Gbits/sec
[  5]   4.02-5.01   sec  2.84 GBytes  24.5 Gbits/sec
[  5]   5.01-6.02   sec  2.54 GBytes  21.7 Gbits/sec
[  5]   6.02-7.07   sec  2.18 GBytes  17.8 Gbits/sec
[  5]   7.07-8.02   sec  1.76 GBytes  15.9 Gbits/sec
[  5]   8.02-9.01   sec  1.88 GBytes  16.3 Gbits/sec
[  5]   9.01-10.02  sec  1.90 GBytes  16.2 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.02  sec  21.3 GBytes  18.3 Gbits/sec                  receiver

vs 15.0:

[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.01   sec  1.85 GBytes  15.7 Gbits/sec
[  5]   1.01-2.02   sec  3.23 GBytes  27.5 Gbits/sec
[  5]   2.02-3.03   sec  1.84 GBytes  15.7 Gbits/sec
[  5]   3.03-4.01   sec  1.86 GBytes  16.3 Gbits/sec
[  5]   4.01-5.01   sec  1.64 GBytes  14.1 Gbits/sec
[  5]   5.01-6.07   sec  1.87 GBytes  15.1 Gbits/sec
[  5]   6.07-7.01   sec  1.23 GBytes  11.3 Gbits/sec
[  5]   7.01-8.01   sec  1.85 GBytes  15.8 Gbits/sec
[  5]   8.01-9.01   sec  1.42 GBytes  12.2 Gbits/sec
[  5]   9.01-10.01  sec  1.81 GBytes  15.5 Gbits/sec
[  5]  10.01-10.07  sec  99.9 MBytes  14.1 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.07  sec  18.7 GBytes  16.0 Gbits/sec                  receiver

This is reliably repeatable.

II. compilation speed

The real and serious problem.
Both versions of the system ship the same clang version:

FreeBSD clang version 19.1.7 (https://github.com/llvm/llvm-project.git llvmorg-19.1.7-0-gcd708029e0b2)
Target: x86_64-unknown-freebsd14.3
Thread model: posix
InstalledDir: /usr/bin

FreeBSD clang version 19.1.7 (https://github.com/llvm/llvm-project.git llvmorg-19.1.7-0-gcd708029e0b2)
Target: x86_64-unknown-freebsd15.0
Thread model: posix
InstalledDir: /usr/bin

I found that compiling the will-it-scale suite takes about twice the
real time, with time spent in userspace doubling as well.

will-it-scale needs a little bit of massaging to work, diff at the end.

Check this out (repeatable):

while true; do gmake -s clean && time gmake -s -j 8; done

14.3:
gmake -s -j 8  8.93s user 2.03s system 769% cpu 1.42s (1.424) total
gmake -s -j 8  9.02s user 2.16s system 757% cpu 1.48s (1.475) total
gmake -s -j 8  9.29s user 1.95s system 774% cpu 1.45s (1.450) total
gmake -s -j 8  8.97s user 2.46s system 770% cpu 1.48s (1.484) total
gmake -s -j 8  9.13s user 2.30s system 773% cpu 1.48s (1.477) total

15.0:
gmake -s -j 8  19.90s user 3.02s system 773% cpu 2.96s (2.963) total
gmake -s -j 8  19.90s user 3.18s system 774% cpu 2.98s (2.979) total
gmake -s -j 8  20.24s user 2.90s system 770% cpu 3.00s (3.005) total
gmake -s -j 8  19.92s user 3.25s system 771% cpu 3.00s (3.003) total
gmake -s -j 8  20.25s user 2.95s system 772% cpu 3.01s (3.006) total

User time *skyrocketed*.

This is not some weird scheduling anomaly either:

while true; do gmake -s clean && time cpuset -l 1 gmake -s ; done

14.3:
cpuset -l 1 gmake -s  8.88s user 1.11s system 99% cpu 10.00s (10.003) total
cpuset -l 1 gmake -s  8.94s user 1.12s system 99% cpu 10.07s (10.067) total
cpuset -l 1 gmake -s  9.00s user 1.06s system 99% cpu 10.07s (10.072) total
cpuset -l 1 gmake -s  8.88s user 1.17s system 99% cpu 10.07s (10.069) total
cpuset -l 1 gmake -s  8.88s user 1.23s system 99% cpu 10.13s (10.127) total

15.0:
cpuset -l 1 gmake -s  21.58s user 2.33s system 99% cpu 23.96s (23.961) total
cpuset -l 1 gmake -s  21.16s user 2.54s system 99% cpu 23.76s (23.759) total
cpuset -l 1 gmake -s  19.90s user 1.90s system 99% cpu 21.85s (21.854) total
cpuset -l 1 gmake -s  19.76s user 1.74s system 99% cpu 21.55s (21.554) total
cpuset -l 1 gmake -s  19.72s user 1.75s system 99% cpu 21.53s (21.526) total

Per my previous remark I found userspace profiling to be
non-operational and I did not try to fight it.

I did however do a few sanity checks, mostly with will-it-scale:

1. syscall rate is down over 7% (tested with getppid1_processes)
2. malloc also got a slowdown(!). There are 2 benches, one ends up
   issuing syscalls, the other does not. Results in ops/s:

malloc1_processes (malloc/free of 128MB):
14.3: 1960769
15.0: 1376087 (-30%)

malloc2_processes (malloc/free of 1kB):
14.3: 156034491
15.0: 51645759 (-67%)

Apart from that the kernel is overall slower, for example negative
path lookups also regressed (-12%).

Another issue is execve rate. To bench that I borrowed the following:
http://apollo.backplane.com/DFlyMisc/doexec.c

cc -O2 doexec.c
cpuset -l 1 ./a.out 1

In ops/s:
14.3: 4905
15.0: 3672 (-25%)

The clang slowdown might happen to be clang-specific. Whatever it is,
I think the total slowdown is serious enough that it warrants
investigation and an errata notice.

But you do you, I am *not* going to work on this.
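For anyone who wants to double-check the percentages quoted in this mail, here is a minimal sh sketch that recomputes them from the raw figures. The helper name pct_change is made up for this illustration and is not part of will-it-scale or doexec; the numbers are copied from the results reported here (14.3 first, 15.0 second).

```shell
#!/bin/sh
# pct_change BEFORE AFTER: print the relative change in percent, rounded
# to one decimal. Hypothetical helper, for illustration only; awk is used
# because plain sh has no floating point arithmetic.
pct_change() {
	awk -v before="$1" -v after="$2" \
	    'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

# 14.3 figure first, 15.0 figure second.
echo "iperf3 localhost (Gbits/sec): $(pct_change 18.3 16.0)%"
echo "malloc1_processes (ops/s):    $(pct_change 1960769 1376087)%"
echo "malloc2_processes (ops/s):    $(pct_change 156034491 51645759)%"
echo "doexec (ops/s):               $(pct_change 4905 3672)%"
```

The iperf line uses the two 10-second receiver averages, so it only summarizes single runs rather than a proper multi-run comparison.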
will-it-scale howto:

pkg install gmake hwloc
git clone https://github.com/antonblanchard/will-it-scale

add this:

diff --git a/Makefile b/Makefile
index 8dd0717..d779705 100644
--- a/Makefile
+++ b/Makefile
@@ -1,9 +1,11 @@
-CFLAGS+=-Wall -O2 -g
-LDFLAGS+=-lhwloc
+CFLAGS+=-Wall -O2 -g -I/usr/local/include
+LDFLAGS+=-lhwloc -L/usr/local/lib
 
 processes := $(patsubst tests/%.c,%_processes,$(wildcard tests/*.c))
 threads := $(patsubst tests/%.c,%_threads,$(wildcard tests/*.c))
 
+threadspawn1_processes_FLAGS+=-lpthread
+
 all: processes threads
 
 processes: $(processes)
diff --git a/tests/malloc1.c b/tests/malloc1.c
index 14d4c3b..05737bb 100644
--- a/tests/malloc1.c
+++ b/tests/malloc1.c
@@ -12,6 +12,7 @@ void testcase(unsigned long long *iterations, unsigned long nr)
 	while (1) {
 		void *addr = malloc(SIZE);
 		assert(addr != NULL);
+		asm volatile("" :: "m" (addr));
 		free(addr);
 
 		(*iterations)++;
diff --git a/tests/malloc2.c b/tests/malloc2.c
index c24aceb..e769dd3 100644
--- a/tests/malloc2.c
+++ b/tests/malloc2.c
@@ -12,6 +12,7 @@ void testcase(unsigned long long *iterations, unsigned long nr)
 	while (1) {
 		void *addr = malloc(SIZE);
 		assert(addr != NULL);
+		asm volatile("" :: "m" (addr));
 		free(addr);
 
 		(*iterations)++;
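The howto steps could be strung together roughly as in the sketch below. It assumes the diff from this mail has been saved next to the checkout under the hypothetical name freebsd.diff. The run and setup_wis helpers and the DRY_RUN switch are also made up for the illustration. DRY_RUN defaults to 1, so running the sketch only prints the commands; set DRY_RUN=0 (as root, since pkg install needs it) to actually execute them.

```shell
#!/bin/sh
# Sketch of the will-it-scale setup steps from this mail.
# PATCH is a hypothetical file name for the diff shown in the mail.
PATCH=${PATCH:-freebsd.diff}
# Default to a dry run so nothing is installed or cloned by accident.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: print the command in dry-run mode, else execute it.
run() {
	if [ "$DRY_RUN" = 1 ]; then
		echo "+ $*"
	else
		"$@"
	fi
}

setup_wis() {
	run pkg install -y gmake hwloc git &&
	run git clone https://github.com/antonblanchard/will-it-scale &&
	# git -C changes into the checkout first, hence the ../ prefix.
	run git -C will-it-scale apply "../$PATCH" &&
	run gmake -C will-it-scale -s -j 8
}

setup_wis
```

The -j 8 matches the 8 core vms used for the measurements here; adjust it to the machine at hand.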