Re: Support for more than 256 CPU cores

From: Jan Bramkamp <crest_at_rlwinm.de>
Date: Sun, 07 May 2023 21:19:01 UTC
On 05.05.23 17:52, Hans Petter Selasky wrote:
> On 5/5/23 17:23, Tomek CEDRO wrote:
>> On Fri, May 5, 2023 at 3:38 PM Ed Maste wrote:
>>> FreeBSD supports up to 256 CPU cores in the default kernel 
>>> configuration
>>> (on Tier-1 architectures).  Systems with more than 256 cores are
>>> available now, and will become increasingly common over FreeBSD 14’s
>>> lifetime. (..)
>>
>> Congratulations! :-)
>>
>> I am looking at an AMD Threadripper with 64 cores and 2 threads each,
>> which will give 128 CPUs to the system.. maybe this year I can afford
>> that beast, then I will report back after testing :-)
>>
>> In the upcoming years, variations of RISC-V will provide previously
>> unheard-of numbers of CPUs in a single SoC (i.e. 1000 CPUs) at amazing
>> power efficiency, and I saw reports of a prototype with 3 x SoC of this
>> kind on a single board :-)
>>
>> https://spectrum.ieee.org/risc-v-ai
>>
>
> Hi,
>
> Maybe it makes sense to cluster CPUs in logical groups somehow. Some 
> synchronization mechanisms like EPOCH() are O(N²) where N is the number 
> of CPUs. Not in the read case, but in the synchronize case. It depends 
> a bit though. Currently EPOCH() is executed every kern.hz .

Unless the implementation scales quadratically with the number of 
logical CPUs representable by cpuset_t, rather than the number of CPUs 
actually available to the kernel, it should still be worth changing the 
ABI in time for 14.0, even if the existing implementation of EPOCH 
doesn't scale well. If the ABI isn't extended in time, the whole 14.x 
major release line will either be limited to 256 logical CPUs, require 
users to change the type and recompile *everything*, or have to 
reinterpret cpuset_t as something other than a flat bitmap of cores.

In my opinion there are three important constraints:

  * Don't delay the 14.0 release more than necessary.

  * Don't penalize systems with 256 or fewer cores too much.

  * Make the struct big enough to allow a better implementation later.

Just increasing the bitset by a factor of four shouldn't be too 
expensive, but is it enough to allow the implementation of algorithms 
that scale better, or is additional storage needed, e.g. a bit for each 
logical group at each level of grouping?