From nobody Fri Aug 15 07:23:09 2025 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4c3D9B1ZPBz64hwb; Fri, 15 Aug 2025 07:23:10 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R10" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4c3D9B0Q1xz48jT; Fri, 15 Aug 2025 07:23:10 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1755242590; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=OP8LyjjHw0W79gNIVBkFfly3+CNvveOtX/t1oauklF0=; b=pNei/Ow7gCyyR5qe7lOsHPBS5+/fNdaZG0YkJIGaxbuJuHOYPemif23X4WoKZaUZO+VtP9 bt5XWyh7j65MFIs1hO0aGO5+qMmpXMoaRi4MwPx15E9WzRJcoGcAnl38zOYqTJPjtaNJFc dUwrwoePHuWUYt9S5GHfxooeaGeF5+0Gu8qKCE8uq+s+bUG4fPmHWWbasWDmxe3s6s24Xe 41m5rUuWGWDJ+h6PtuNIow+1Zztc/oZR9Kk7KMN/likpFWBEzqNB4nfPFYiVK2xxFYpUI6 /40q6hqcR0qC+Q9heRw/Rx45hbCbVJQDQECmQMNDGFKSy0gENpcw7MWVmebLSw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1755242590; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=OP8LyjjHw0W79gNIVBkFfly3+CNvveOtX/t1oauklF0=; b=XZsq7RcprQ0NuMLbXvhHO+HPeFApTUsAoIJR6Rw2SGJBXwdRwHEX4vPCnibDZVx8/ZlU9Y adb5rrWbwIzCt1GZ6K5psr0pzaUOzCdQ9C/F5XeG05kmRswQe78ctRtNPB4aBUb+4w0Lu/ imnmgqrqdc1k+5Gf8CrV9I5LmO4OrX6gkbRPh+IS4monojTA8c/S9MrsXA9HK/6Tt2JYNV 3B5twrxKe1dseYnlHqXi5e9vwCHmSOvgWb1d3k+xkLjM6r9/FRMtOPufl8SLjlgnt91zZK dwdN5Km2V3TRzGkx9VdLRrdw5l9E0hGVpEorhpUhDaZiIh90MXMtZJh0Xa6TzQ== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1755242590; a=rsa-sha256; cv=none; b=H+JJ06aHPAfP1STeUC6mAC0+BtmhP9nA6VYORo7yOoXbDgXVp3hGhuK8ipbFeEInvSEvsg 8hPKEX+twSmN6xum/vvV+oXsrZn1r2rQAvOenYL5AFEevqExVx1+pq88Nrue8p+dcXuf9z uwMOqLcZhrAY29XCf6filke/YkyZXIqHll02NsqzxE2mvsW4TxhrbdOylLUDxLpexO2DC2 0Bw37pzc/n0XwoFqaDUDizkRZhJvA0uE48Qd4wU0KrFvtiQeDUufjcTVX3YCFHUQ/eoZhD bxx+8zOmx3PnbftGmVX8SFyRcH2jGyedFY5VmVrP3LRc8J9pUsPi7ahByVEIiQ== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4c3D996wkCzD94; Fri, 15 Aug 2025 07:23:09 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.18.1/8.18.1) with ESMTP id 57F7N9vx007861; Fri, 15 Aug 2025 07:23:09 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.18.1/8.18.1/Submit) id 57F7N9e8007858; Fri, 15 Aug 2025 07:23:09 GMT (envelope-from git) Date: Fri, 15 Aug 2025 07:23:09 GMT Message-Id: <202508150723.57F7N9e8007858@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org From: Dag-Erling =?utf-8?Q?Sm=C3=B8rgrav?= Subject: git: 5205b32de3fb - main - libc: Drop incorrect qsort optimization List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: X-BeenThere: dev-commits-src-main@freebsd.org Sender: owner-dev-commits-src-main@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: des X-Git-Repository: src X-Git-Refname: refs/heads/main X-Git-Reftype: branch X-Git-Commit: 5205b32de3fb7702e96b3991f5b1a61eee406d8b Auto-Submitted: auto-generated The branch main has been updated by des: URL: https://cgit.FreeBSD.org/src/commit/?id=5205b32de3fb7702e96b3991f5b1a61eee406d8b commit 5205b32de3fb7702e96b3991f5b1a61eee406d8b Author: Dag-Erling Smørgrav AuthorDate: 2025-08-15 07:22:33 +0000 Commit: Dag-Erling Smørgrav CommitDate: 2025-08-15 07:23:03 +0000 libc: Drop incorrect qsort optimization As pointed out in the PR and the article linked below, the switch to insertion sort in the BSD qsort code is based on a misunderstanding of Knuth's TAOCP and is actually a pessimization. As demonstrated by the added test, it is trivially easy to construct pathological input which results in quadratic runtime. Without that misguided optimization, the same input runs in nearly linearithmic time. https://www.raygard.net/2022/02/26/Re-engineering-a-qsort-part-3 PR: 287089 MFC after: 1 week Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D51907 --- lib/libc/stdlib/qsort.c | 13 ----- lib/libc/tests/stdlib/Makefile | 1 + lib/libc/tests/stdlib/qsort_bench.c | 113 ++++++++++++++++++++++++++++++++++++ 3 files changed, 114 insertions(+), 13 deletions(-) diff --git a/lib/libc/stdlib/qsort.c b/lib/libc/stdlib/qsort.c index e0b06494cf98..400124593d07 100644 --- a/lib/libc/stdlib/qsort.c +++ b/lib/libc/stdlib/qsort.c @@ -106,13 +106,11 @@ local_qsort(void *a, size_t n, size_t es, cmp_t *cmp, void *thunk) char *pa, *pb, *pc, *pd, *pl, *pm, *pn; size_t d1, d2; int cmp_result; - int swap_cnt; /* if there are less than 2 elements, then sorting is not needed */ if (__predict_false(n < 2)) return; loop: - swap_cnt = 0; if (n < 7) { for (pm = (char *)a + es; pm < (char *)a + n * es; pm += es) for (pl = pm; @@ -141,7 +139,6 @@ loop: for (;;) { while (pb <= pc && (cmp_result = CMP(thunk, pb, a)) <= 0) { if (cmp_result == 0) { - swap_cnt = 1; swapfunc(pa, pb, es); pa += es; } @@ -149,7 +146,6 @@ loop: } while (pb <= pc && (cmp_result = CMP(thunk, pc, a)) >= 0) { if (cmp_result == 0) { - swap_cnt = 1; swapfunc(pc, pd, es); pd -= es; } @@ -158,18 +154,9 @@ loop: if (pb > pc) break; swapfunc(pb, pc, es); - swap_cnt = 1; pb += es; pc -= es; } - if (swap_cnt == 0) { /* Switch to insertion sort */ - for (pm = (char *)a + es; pm < (char *)a + n * es; pm += es) - for (pl = pm; - pl > (char *)a && CMP(thunk, pl - es, pl) > 0; - pl -= es) - swapfunc(pl, pl - es, es); - return; - } pn = (char *)a + n * es; d1 = MIN(pa - (char *)a, pb - pa); diff --git a/lib/libc/tests/stdlib/Makefile b/lib/libc/tests/stdlib/Makefile index 50726a5d8af6..9d84becfbd1f 100644 --- a/lib/libc/tests/stdlib/Makefile +++ b/lib/libc/tests/stdlib/Makefile @@ -14,6 +14,7 @@ ATF_TESTS_C+= qsort_b_test ATF_TESTS_C+= qsort_r_compat_test ATF_TESTS_C+= qsort_r_test ATF_TESTS_C+= qsort_s_test +ATF_TESTS_C+= qsort_bench ATF_TESTS_C+= set_constraint_handler_s_test ATF_TESTS_C+= strfmon_test ATF_TESTS_C+= tsearch_test diff --git a/lib/libc/tests/stdlib/qsort_bench.c b/lib/libc/tests/stdlib/qsort_bench.c new file mode 100644 index 000000000000..5f2cfae40140 --- /dev/null +++ b/lib/libc/tests/stdlib/qsort_bench.c @@ -0,0 +1,113 @@ +/*- + * Copyright (c) 2025 Dag-Erling Smørgrav + * + * SPDX-License-Identifier: BSD-2-Clause + */ + +#include +#include +#include +#include +#include +#include + +/*- + * Measures qsort(3) runtime with pathological input and verify that it + * stays close to N * log2(N). + * + * Thanks to Vivian Hussey for the proof of concept. + * + * The input we construct is similar to a sweep from 0 to N where each + * half, except for the first element, has been reversed; for instance, + * with N = 8, we get { 0, 3, 2, 1, 4, 8, 7, 6 }. This triggers a bug in + * the BSD qsort(3) where it will switch to insertion sort if the pivots + * are sorted. + * + * This article goes into more detail about the bug and its origin: + * + * https://www.raygard.net/2022/02/26/Re-engineering-a-qsort-part-3 + * + * With this optimization (the `if (swap_cnt == 0)` block), qsort(3) needs + * roughly N * N / 4 comparisons to sort our pathological input. Without + * it, it needs only a little more than N * log2(N) comparisons. + */ + +/* we stop testing once a single takes longer than this */ +#define MAXRUNSECS 10 + +static bool debugging; + +static uintmax_t ncmp; + +static int +intcmp(const void *a, const void *b) +{ + ncmp++; + return ((*(int *)a > *(int *)b) - (*(int *)a < *(int *)b)); +} + +static void +qsort_bench(int log2n) +{ + uintmax_t n = 1LLU << log2n; + int *buf; + + /* fill an array with a pathological pattern */ + ATF_REQUIRE(buf = malloc(n * sizeof(*buf))); + buf[0] = 0; + buf[n / 2] = n / 2; + for (unsigned int i = 1; i < n / 2; i++) { + buf[i] = n / 2 - i; + buf[n / 2 + i] = n - i; + } + + ncmp = 0; + qsort(buf, n, sizeof(*buf), intcmp); + + /* check result and free array */ + if (debugging) { + for (unsigned int i = 1; i < n; i++) { + ATF_REQUIRE_MSG(buf[i] > buf[i - 1], + "array is not sorted"); + } + } + free(buf); + + /* check that runtime does not exceed N² */ + ATF_CHECK_MSG(ncmp / n < n, + "runtime %ju exceeds N² for N = %ju", ncmp, n); + + /* check that runtime does not exceed N log N by much */ + ATF_CHECK_MSG(ncmp / n <= log2n + 1, + "runtime %ju exceeds N log N for N = %ju", ncmp, n); +} + +ATF_TC_WITHOUT_HEAD(qsort_bench); +ATF_TC_BODY(qsort_bench, tc) +{ + struct timespec t0, t1; + uintmax_t tus; + + for (int i = 10; i <= 30; i++) { + clock_gettime(CLOCK_UPTIME, &t0); + qsort_bench(i); + clock_gettime(CLOCK_UPTIME, &t1); + tus = t1.tv_sec * 1000000 + t1.tv_nsec / 1000; + tus -= t0.tv_sec * 1000000 + t0.tv_nsec / 1000; + if (debugging) { + fprintf(stderr, "N = 2^%d in %ju.%06jus\n", + i, tus / 1000000, tus % 1000000); + } + /* stop once an individual run exceeds our limit */ + if (tus / 1000000 >= MAXRUNSECS) + break; + } +} + +ATF_TP_ADD_TCS(tp) +{ + debugging = !getenv("__RUNNING_INSIDE_ATF_RUN") && + isatty(STDERR_FILENO); + ATF_TP_ADD_TC(tp, qsort_bench); + return (atf_no_error()); +}