cvs commit: src/sys/kern sched_ule.c
Jeff Roberson
jroberson at chesapeake.net
Sun Mar 2 08:27:44 UTC 2008
On Sun, 2 Mar 2008, Jeff Roberson wrote:
> jeff 2008-03-02 08:20:59 UTC
>
> FreeBSD src repository
>
> Modified files:
> sys/kern sched_ule.c
> Log:
> Add support for the new cpu topology api:
> - When searching for affinity search backwards in the tree from the last
> cpu we ran on while the thread still has affinity for the group. This
> can take advantage of knowledge of shared L2 or L3 caches among a
> group of cores.
> - When searching for the least loaded cpu find the least loaded cpu via
> the least loaded path through the tree. This load balances system bus
> links, individual cache levels, and hyper-threaded/SMT cores.
> - Make the periodic balancer recursively balance the highest and lowest
> loaded cpu across each link.
>
> Add support for cpusets:
> - Convert the cpuset to a simple native cpumask_t while the kernel still
> only supports cpumask.
> - Pass the derived cpumask down through the cpu_search functions to
> restrict the result cpus.
> - Make the various steal functions resilient to failure since all threads
> can not run on all cpus any longer.
>
> General improvements:
> - Precisely track the lowest priority thread on every runq with
> tdq_setlowpri(). Before it was more advisory but this ended up having
> pathological behaviors.
> - Remove many #ifdef SMP conditions to simplify the code.
> - Get rid of the old cumbersome tdq_group. This is more naturally
> expressed via the cpu_group tree.
>
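To illustrate the least-loaded-path search and the cpumask restriction described in the log, here is a minimal sketch. It is not the code in sched_ule.c: the struct layout and the names group_load() and search_lowest() are simplified, hypothetical stand-ins, and cpumask_t is assumed to be a plain unsigned bitmask.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in: one bit per cpu, as a plain bitmask. */
typedef unsigned int cpumask_t;

/* Simplified topology node; NOT the kernel's struct cpu_group. */
struct cpu_group {
	struct cpu_group *children;	/* array of child groups, NULL at a leaf */
	int		nchildren;	/* 0 at a leaf */
	cpumask_t	mask;		/* cpus covered by this group */
};

/* Sum the load of every cpu that is in both the group and the allowed mask. */
static int
group_load(const struct cpu_group *cg, cpumask_t allowed, const int *load)
{
	int cpu, total = 0;

	for (cpu = 0; cpu < (int)(sizeof(cpumask_t) * CHAR_BIT); cpu++)
		if ((cg->mask & allowed) & (1u << cpu))
			total += load[cpu];
	return (total);
}

/*
 * Walk down the tree, at each level taking the child with the lowest
 * total load among the allowed cpus, then pick the least loaded allowed
 * cpu in the leaf.  Returns the cpu id, or -1 if the mask permits none.
 */
static int
search_lowest(const struct cpu_group *cg, cpumask_t allowed, const int *load)
{
	const struct cpu_group *best, *ch;
	int i, cpu, l, bestload, bestcpu;

	if ((cg->mask & allowed) == 0)
		return (-1);
	while (cg->nchildren > 0) {
		best = NULL;
		bestload = INT_MAX;
		for (i = 0; i < cg->nchildren; i++) {
			ch = &cg->children[i];
			if ((ch->mask & allowed) == 0)
				continue;	/* no permitted cpu below here */
			l = group_load(ch, allowed, load);
			if (l < bestload) {
				bestload = l;
				best = ch;
			}
		}
		if (best == NULL)
			return (-1);
		cg = best;
	}
	bestcpu = -1;
	bestload = INT_MAX;
	for (cpu = 0; cpu < (int)(sizeof(cpumask_t) * CHAR_BIT); cpu++) {
		if (((cg->mask & allowed) & (1u << cpu)) != 0 &&
		    load[cpu] < bestload) {
			bestload = load[cpu];
			bestcpu = cpu;
		}
	}
	return (bestcpu);
}
```

Note that in this sketch the path choice uses the summed load of each group, so the result can differ from the globally least loaded cpu: an idle cpu in a busy group loses to a lightly loaded cpu in a quieter group. That is the sense in which the links and cache levels get balanced rather than individual cpus alone.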
With these changes ULE is the only scheduler that supports the new cpuset
api. The api calls succeed on 4BSD, but that scheduler doesn't obey the masks.
I don't presently have a plan to implement it on 4BSD, as it would be
potentially very inefficient to search the runq for a compatible thread on
every context switch. I won't object if someone else wants to implement
this; otherwise I'll make the syscalls return ENOSYS if 4BSD is compiled
in.

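To make the cost concern concrete, here is a minimal sketch of choosing a thread from one shared run queue while honoring per-thread masks. It is an illustration only, not 4BSD's actual runq code: struct thread, cpumask_t and runq_choose_for_cpu() are hypothetical, simplified names.

```c
#include <stddef.h>

/* Hypothetical stand-in: one bit per cpu, as a plain bitmask. */
typedef unsigned int cpumask_t;

/* Simplified thread; NOT the kernel's struct thread. */
struct thread {
	struct thread	*next;		/* next in the queue, best priority first */
	cpumask_t	allowed;	/* cpus this thread may run on */
	int		id;
};

/*
 * Unlink and return the first queued thread that may run on 'cpu',
 * or NULL if none may.  Every incompatible thread ahead of it has to
 * be skipped, so the worst case is linear in the queue length.
 */
static struct thread *
runq_choose_for_cpu(struct thread **head, int cpu)
{
	struct thread **tdp, *td;

	for (tdp = head; (td = *tdp) != NULL; tdp = &td->next) {
		if ((td->allowed & (1u << cpu)) != 0) {
			*tdp = td->next;
			td->next = NULL;
			return (td);
		}
	}
	return (NULL);
}
```

Without masks the chooser could simply take the head of the queue. With masks, a cpu may have to walk past many threads it is not allowed to run, and it does so on every switch.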
The improved cpu topology load balancing is a mixed bag. On some
workloads we see considerable improvements. Right now mysql suffers when
it has large numbers of threads, but other things seem much improved. I
will continue to tune this; in most cases it's already a win.

Kris has done some excellent benchmarking as usual. Here you can see the
improvement in postgres depending on various scheduler debug settings:
http://people.freebsd.org/~kris/scaling/pgsql-16cpu.png
The horrible green line is 7.0 for reference. The blue line is the same
16-core machine with half of the cores disabled.
Thanks,
Jeff
> Sponsored by: Nokia
> Testing by: kris
>
> Revision Changes Path
> 1.226 +443 -501 src/sys/kern/sched_ule.c
>