From nobody Wed Aug 27 19:22:09 2025 X-Original-To: dev-commits-src-all@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4cBvYF31nKz65W6q; Wed, 27 Aug 2025 19:22:09 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R12" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4cBvYF1TVYz43gk; Wed, 27 Aug 2025 19:22:09 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1756322529; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=7MTtzovHBjtSIQOlWNCqWBf7UEed/SPfr1t9a2mgv3o=; b=PI+5GTl8vR2OHBAEafaSS7FF2UbwFw3+QqcKFXbisXX0k8qdFFkLtkVs1j1HvnzAqKfxeh PU9tgPx4RVvj8FTewzwUbG2EH92wUEGNzK3e3m4Cg5L+CxqsQ990KCSb+dY6YjQQuCvLtm DYYiL3QiVfMh7xIAzSNxpVbgzXx08hF67UabPcPCSPK8XvpNIzuqR4G7fknM7dws0Lo/Mm rZIIBCdGKIiSXwY+vOpxqEb5Na/xi9XwyTAPMN3GAaCZ4vyqHw6deVdO/yDUuSrcFb3iRJ SJaUkZezPb5I/YxESTbFjA6C8OHB0qEmDzmmvp4fJfk44xYlTNJhKjpAHxfC/Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1756322529; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=7MTtzovHBjtSIQOlWNCqWBf7UEed/SPfr1t9a2mgv3o=; b=DpwC1/jTWtDQ0Do64Sm4prx+O3TvTIQyWGC/SEBbkym0SgPFwf82epMY5M9VlE2ugImWXF nTAea6YElxX0KgSn8mOTDv1Wm35I4duy4EBK4PkUSdZYesLt4+soO5vovS3i1meujsQscv TG4epUvpV/ajBbcE9Lz2KqkvHBP5mo8h28brIiXDK01q9ZbjRi6fuPo3HDBG+blLfmv7kg TVSRS93RrGYDojntXlM5/hUZ+xlRmIwNKtrAYw4Mydb2irZo2lhaHxle6MxSZsW75TGXS0 OYUxBlH+SF6qBSR7CLBwoHVudUr4tGRb/aZLcai5xF25w1gD0pEo5rn2eGf/XQ== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1756322529; a=rsa-sha256; cv=none; b=vvwzWyIeNmWzFkj21o9S5WZGSYYwJH89a1zE3pqbDWOym/ILyLmFjOBDFEhm1C6xCZa1+M fnG5SCEkQMZjPfXXPmFsKuw7FV4Npn3ks+WH4jZT2Jt3rJQNCHhk4h1M/3btNpnr7gAggY Xm1VsKY3WYol/0KrTZWc3LFBZY2+pwqlm9lJL/sMt2qf1PhMpdYc5QlcMeouvIy4E6Vguc t19vRV9csPlwAybUbpEEKCPxy6dCZhSOhBw2LvIMaYQZUiFyEe9RIKbT+Grxo1U54Eop2F nMZwnxZT70L0fAGbERL9R0Z1w06hOaBCsC/6eBTEYS4n7cPbcEb8OlxLd0aJig== ARC-Authentication-Results: i=1; mx1.freebsd.org; none Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4cBvYF0wXjzcCJ; Wed, 27 Aug 2025 19:22:09 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.18.1/8.18.1) with ESMTP id 57RJM97V040963; Wed, 27 Aug 2025 19:22:09 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.18.1/8.18.1/Submit) id 57RJM9T1040960; Wed, 27 Aug 2025 19:22:09 GMT (envelope-from git) Date: Wed, 27 Aug 2025 19:22:09 GMT Message-Id: <202508271922.57RJM9T1040960@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org From: Dag-Erling =?utf-8?Q?Sm=C3=B8rgrav?= Subject: git: f1e93df44ba0 - stable/14 - libc: Drop incorrect qsort optimization List-Id: Commit messages for all branches of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all List-Help: List-Post: List-Subscribe: List-Unsubscribe: X-BeenThere: dev-commits-src-all@freebsd.org Sender: owner-dev-commits-src-all@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: des X-Git-Repository: src X-Git-Refname: refs/heads/stable/14 X-Git-Reftype: branch X-Git-Commit: f1e93df44ba0b494c6eda01958a71fec501bc32b Auto-Submitted: auto-generated The branch stable/14 has been updated by des: URL: https://cgit.FreeBSD.org/src/commit/?id=f1e93df44ba0b494c6eda01958a71fec501bc32b commit f1e93df44ba0b494c6eda01958a71fec501bc32b Author: Dag-Erling Smørgrav AuthorDate: 2025-08-15 07:22:33 +0000 Commit: Dag-Erling Smørgrav CommitDate: 2025-08-27 18:49:58 +0000 libc: Drop incorrect qsort optimization As pointed out in the PR and the article linked below, the switch to insertion sort in the BSD qsort code is based on a misunderstanding of Knuth's TAOCP and is actually a pessimization. As demonstrated by the added test, it is trivially easy to construct pathological input which results in quadratic runtime. Without that misguided optimization, the same input runs in nearly linearithmic time. https://www.raygard.net/2022/02/26/Re-engineering-a-qsort-part-3 PR: 287089 MFC after: 1 week Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D51907 (cherry picked from commit 5205b32de3fb7702e96b3991f5b1a61eee406d8b) --- lib/libc/stdlib/qsort.c | 13 ----- lib/libc/tests/stdlib/Makefile | 1 + lib/libc/tests/stdlib/qsort_bench.c | 113 ++++++++++++++++++++++++++++++++++++ 3 files changed, 114 insertions(+), 13 deletions(-) diff --git a/lib/libc/stdlib/qsort.c b/lib/libc/stdlib/qsort.c index 22308887de1a..0735cf635b51 100644 --- a/lib/libc/stdlib/qsort.c +++ b/lib/libc/stdlib/qsort.c @@ -109,13 +109,11 @@ local_qsort(void *a, size_t n, size_t es, cmp_t *cmp, void *thunk) char *pa, *pb, *pc, *pd, *pl, *pm, *pn; size_t d1, d2; int cmp_result; - int swap_cnt; /* if there are less than 2 elements, then sorting is not needed */ if (__predict_false(n < 2)) return; loop: - swap_cnt = 0; if (n < 7) { for (pm = (char *)a + es; pm < (char *)a + n * es; pm += es) for (pl = pm; @@ -144,7 +142,6 @@ loop: for (;;) { while (pb <= pc && (cmp_result = CMP(thunk, pb, a)) <= 0) { if (cmp_result == 0) { - swap_cnt = 1; swapfunc(pa, pb, es); pa += es; } @@ -152,7 +149,6 @@ loop: } while (pb <= pc && (cmp_result = CMP(thunk, pc, a)) >= 0) { if (cmp_result == 0) { - swap_cnt = 1; swapfunc(pc, pd, es); pd -= es; } @@ -161,18 +157,9 @@ loop: if (pb > pc) break; swapfunc(pb, pc, es); - swap_cnt = 1; pb += es; pc -= es; } - if (swap_cnt == 0) { /* Switch to insertion sort */ - for (pm = (char *)a + es; pm < (char *)a + n * es; pm += es) - for (pl = pm; - pl > (char *)a && CMP(thunk, pl - es, pl) > 0; - pl -= es) - swapfunc(pl, pl - es, es); - return; - } pn = (char *)a + n * es; d1 = MIN(pa - (char *)a, pb - pa); diff --git a/lib/libc/tests/stdlib/Makefile b/lib/libc/tests/stdlib/Makefile index 974bbf7c0704..14f1ab022c37 100644 --- a/lib/libc/tests/stdlib/Makefile +++ b/lib/libc/tests/stdlib/Makefile @@ -13,6 +13,7 @@ ATF_TESTS_C+= qsort_b_test ATF_TESTS_C+= qsort_r_compat_test ATF_TESTS_C+= qsort_r_test ATF_TESTS_C+= qsort_s_test +ATF_TESTS_C+= qsort_bench ATF_TESTS_C+= quick_exit_test ATF_TESTS_C+= set_constraint_handler_s_test ATF_TESTS_C+= strfmon_test diff --git a/lib/libc/tests/stdlib/qsort_bench.c b/lib/libc/tests/stdlib/qsort_bench.c new file mode 100644 index 000000000000..5f2cfae40140 --- /dev/null +++ b/lib/libc/tests/stdlib/qsort_bench.c @@ -0,0 +1,113 @@ +/*- + * Copyright (c) 2025 Dag-Erling Smørgrav + * + * SPDX-License-Identifier: BSD-2-Clause + */ + +#include +#include +#include +#include +#include +#include + +/*- + * Measures qsort(3) runtime with pathological input and verify that it + * stays close to N * log2(N). + * + * Thanks to Vivian Hussey for the proof of concept. + * + * The input we construct is similar to a sweep from 0 to N where each + * half, except for the first element, has been reversed; for instance, + * with N = 8, we get { 0, 3, 2, 1, 4, 8, 7, 6 }. This triggers a bug in + * the BSD qsort(3) where it will switch to insertion sort if the pivots + * are sorted. + * + * This article goes into more detail about the bug and its origin: + * + * https://www.raygard.net/2022/02/26/Re-engineering-a-qsort-part-3 + * + * With this optimization (the `if (swap_cnt == 0)` block), qsort(3) needs + * roughly N * N / 4 comparisons to sort our pathological input. Without + * it, it needs only a little more than N * log2(N) comparisons. + */ + +/* we stop testing once a single takes longer than this */ +#define MAXRUNSECS 10 + +static bool debugging; + +static uintmax_t ncmp; + +static int +intcmp(const void *a, const void *b) +{ + ncmp++; + return ((*(int *)a > *(int *)b) - (*(int *)a < *(int *)b)); +} + +static void +qsort_bench(int log2n) +{ + uintmax_t n = 1LLU << log2n; + int *buf; + + /* fill an array with a pathological pattern */ + ATF_REQUIRE(buf = malloc(n * sizeof(*buf))); + buf[0] = 0; + buf[n / 2] = n / 2; + for (unsigned int i = 1; i < n / 2; i++) { + buf[i] = n / 2 - i; + buf[n / 2 + i] = n - i; + } + + ncmp = 0; + qsort(buf, n, sizeof(*buf), intcmp); + + /* check result and free array */ + if (debugging) { + for (unsigned int i = 1; i < n; i++) { + ATF_REQUIRE_MSG(buf[i] > buf[i - 1], + "array is not sorted"); + } + } + free(buf); + + /* check that runtime does not exceed N² */ + ATF_CHECK_MSG(ncmp / n < n, + "runtime %ju exceeds N² for N = %ju", ncmp, n); + + /* check that runtime does not exceed N log N by much */ + ATF_CHECK_MSG(ncmp / n <= log2n + 1, + "runtime %ju exceeds N log N for N = %ju", ncmp, n); +} + +ATF_TC_WITHOUT_HEAD(qsort_bench); +ATF_TC_BODY(qsort_bench, tc) +{ + struct timespec t0, t1; + uintmax_t tus; + + for (int i = 10; i <= 30; i++) { + clock_gettime(CLOCK_UPTIME, &t0); + qsort_bench(i); + clock_gettime(CLOCK_UPTIME, &t1); + tus = t1.tv_sec * 1000000 + t1.tv_nsec / 1000; + tus -= t0.tv_sec * 1000000 + t0.tv_nsec / 1000; + if (debugging) { + fprintf(stderr, "N = 2^%d in %ju.%06jus\n", + i, tus / 1000000, tus % 1000000); + } + /* stop once an individual run exceeds our limit */ + if (tus / 1000000 >= MAXRUNSECS) + break; + } +} + +ATF_TP_ADD_TCS(tp) +{ + debugging = !getenv("__RUNNING_INSIDE_ATF_RUN") && + isatty(STDERR_FILENO); + ATF_TP_ADD_TC(tp, qsort_bench); + return (atf_no_error()); +}