git: aefe0a8c32d3 - main - Refactor/optimize cpu_search_*().

Thu Jul 29 06:07:57 UTC 2021

On Wed, Jul 28, 2021 at 7:00 PM Alexander Motin <mav at freebsd.org> wrote:

> The branch main has been updated by mav:
>
> URL:
> https://cgit.FreeBSD.org/src/commit/?id=aefe0a8c32d370f2fdd0d0771eb59f8845beda17
>
> commit aefe0a8c32d370f2fdd0d0771eb59f8845beda17
> Author:     Alexander Motin <mav at FreeBSD.org>
> AuthorDate: 2021-07-29 01:18:50 +0000
> Commit:     Alexander Motin <mav at FreeBSD.org>
> CommitDate: 2021-07-29 02:00:29 +0000
>
>     Refactor/optimize cpu_search_*().
>
>     Remove cpu_search_both(), unused for many years.  Without it there is
>     less sense for the trick of compiling common cpu_search() into separate
>     cpu_search_lowest() and cpu_search_highest(), so split them completely,
>     making code more readable.  While there, split iteration over children
>     groups and CPUs, which was complicating code for very little
>     deduplication.
>
>     Stop passing cpuset_t arguments by value and avoid some manipulations.
>     Since the MAXCPU bump from 64 to 256, what was a single register has
>     turned into a 32-byte memory array, requiring memory allocation and
>     accesses.  Splitting struct cpu_search into parameter and result parts
>     allows reducing stack usage even further, since the first can be
>     passed through on recursion.
>
>     Remove CPU_FFS() from the hot paths, precalculating first and last CPU
>     for each CPU group in advance during initialization.  Again, it was
>     not a problem for 64 CPUs before, but for 256 FFS needs much more code.
>
>     With these changes, on an 80-thread system doing ~260K uncached ZFS
>     reads per second I observe ~30% reduction of time spent in
>     cpu_search_*().
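
For readers who have not looked at the diff, here is a rough, standalone
sketch of the two ideas described above: passing a 256-bit CPU set by
pointer instead of by value, and caching the first and last CPU of each
group at initialization so the hot path does not need a find-first-set
scan. This is not the committed code. Every name here (sketch_cpuset_t,
struct sketch_group, the sketch_*() functions) is hypothetical and only
mimics the shape of what the commit message describes.

```c
#include <stdint.h>

/* Hypothetical stand-ins, not the kernel's cpuset_t or struct cpu_group. */
#define SKETCH_MAXCPU 256
#define SKETCH_WORDS  (SKETCH_MAXCPU / 64)

/* With 256 CPUs the set is 4 x 64 bits = 32 bytes, too big for a register. */
typedef struct { uint64_t bits[SKETCH_WORDS]; } sketch_cpuset_t;

struct sketch_group {
	sketch_cpuset_t mask;	/* CPUs belonging to this group */
	int first;		/* lowest CPU id in mask, cached at init */
	int last;		/* highest CPU id in mask, cached at init */
};

static void
sketch_set(sketch_cpuset_t *s, int cpu)
{
	s->bits[cpu / 64] |= (uint64_t)1 << (cpu % 64);
}

/* Takes a pointer, so the 32-byte set is never copied onto the stack. */
static int
sketch_isset(const sketch_cpuset_t *s, int cpu)
{
	return ((s->bits[cpu / 64] >> (cpu % 64)) & 1);
}

/* One-time scan at initialization; first/last stay -1 for an empty group. */
static void
sketch_group_init(struct sketch_group *g)
{
	g->first = g->last = -1;
	for (int cpu = 0; cpu < SKETCH_MAXCPU; cpu++) {
		if (sketch_isset(&g->mask, cpu)) {
			if (g->first < 0)
				g->first = cpu;
			g->last = cpu;
		}
	}
}

/*
 * Hot path: return the least-loaded CPU that is both in the group and in
 * the allowed set, or -1 if there is none.  Only the cached
 * [first, last] range is walked; no find-first-set over the full set.
 */
static int
sketch_search_lowest(const struct sketch_group *g,
    const sketch_cpuset_t *allowed, const int *load)
{
	int best = -1;

	for (int cpu = g->first; cpu >= 0 && cpu <= g->last; cpu++) {
		if (!sketch_isset(&g->mask, cpu) || !sketch_isset(allowed, cpu))
			continue;
		if (best < 0 || load[cpu] < load[best])
			best = cpu;
	}
	return (best);
}
```

A caller would fill in the group mask once, call sketch_group_init(), and
then call sketch_search_lowest() with pointers to the group and to the
allowed set on each search. The load array is an assumed per-CPU metric
indexed by CPU id.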

Nice! I recall seeing contention here on other workloads on high core count
systems.

Regards,
Kevin