SCHED_ULE should not be the default
Pieter de Goeje
pieter at degoeje.nl
Mon Dec 12 16:31:01 UTC 2011
On Monday 12 December 2011 14:47:57 O. Hartmann wrote:
> > Not fully right, boinc defaults to run on idprio 31 so this isn't an
> > issue. And yes, there are cases where SCHED_ULE shows much better
> > performance than SCHED_4BSD. [...]
> Do we have any proof at hand for such cases where SCHED_ULE performs
> much better than SCHED_4BSD? Whenever the subject comes up, it is
> mentioned that SCHED_ULE has better performance on boxes with ncpu > 2.
> But in the end I see contradictory statements here. People complain
> about poor performance (especially in scientific environments), and
> others counter that this is not the case.
> Within our department, we developed a highly scalable code for planetary
> science purposes on imagery. It uses GPUs via OpenCL if present.
> Otherwise it grabs as many cores as it can.
> By the end of this year I'll get a new desktop box based on Intel's new
> Sandy Bridge-E architecture with plenty of memory. If the colleague who
> developed the code is willing to perform some benchmarks on the same
> hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most
> recent Suse. For FreeBSD I also intend to look at performance with both
> available schedulers.
In my spare time I do some stuff which can be considered "HPC". If I recall
correctly, the loudest supporters of the notion that SCHED_4BSD is faster
than SCHED_ULE are using more threads than there are cores, causing
contention for CPU cores and, more importantly, unevenly distributed runtimes
among threads, resulting in suboptimal execution times for their programs.
Since I've never actually seen the code in question, it's hard to say whether
this "unfair" distribution actually results in lower throughput or simply
violates an assumption in the code that each thread takes about as long to
finish its task.
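
To put rough numbers on that distinction, here is a deliberately crude model,
not a description of how either scheduler behaves. It assumes the extreme
"unfair" case, where a thread that gets a core keeps it until its work is done.
The function name makespan, the core count and the work sizes are all made up
for illustration. The same total work is split either into 5 equal chunks (one
per thread, on 4 cores) or into 100 small items:

```python
import heapq

def makespan(task_times, cores):
    """Wall-clock time to finish all tasks when each free core takes the
    next task in list order and runs it to completion."""
    finish = [0] * cores               # time at which each core becomes free
    heapq.heapify(finish)
    for t in task_times:
        start = heapq.heappop(finish)  # earliest core to become free
        heapq.heappush(finish, start + t)
    return max(finish)

CORES = 4
TOTAL_WORK = 2000                      # arbitrary units of CPU time

# 5 threads on 4 cores, each owning a fixed fifth of the work
static = makespan([TOTAL_WORK // 5] * 5, CORES)
# the same work cut into 100 small items that any core can pick up
queued = makespan([TOTAL_WORK // 100] * 100, CORES)
ideal = TOTAL_WORK // CORES

print(static, queued, ideal)
```

In this model the fixed split needs 800 units of wall-clock time against 500
for the small items, which is also the ideal. The total CPU work is identical
in both cases; only the assumption that every thread finishes at about the
same time is what breaks.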
Although I haven't actually benchmarked the two schedulers directly, I have no
reason to suspect SCHED_ULE of suboptimal performance because:
1) A program model in which N threads on N cores take work items from a
shared queue until it is empty scales almost perfectly on SCHED_ULE
(I get 398% CPU usage on a quad-core).
2) The same program on Linux (dual boot) compiled with exactly the same
compiler and flags runs slightly slower. I think this has to do with VM
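
For readers unfamiliar with the program model in point 1, a minimal sketch of
its structure follows. It is not the program I measured; run_pool and its
parameters are hypothetical names. Note also that CPython threads share one
interpreter lock, so this version will not reach ~400% CPU on CPU-bound work;
it only shows the shape of the pattern, which would be the same with native
threads.

```python
import queue
import threading

def run_pool(items, work, n_threads):
    """Process every item with `work`, using n_threads workers that pull
    from one shared queue until it is empty."""
    q = queue.Queue()
    for item in items:
        q.put(item)                    # queue is fully filled before start

    results = []
    counts = [0] * n_threads           # items handled by each worker
    lock = threading.Lock()

    def worker(idx):
        while True:
            try:
                item = q.get_nowait()  # take the next work item, if any
            except queue.Empty:
                return                 # queue drained: this thread is done
            value = work(item)
            with lock:                 # protect the shared result list
                results.append(value)
                counts[idx] += 1

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, counts

results, counts = run_pool(range(1000), lambda x: x * x, n_threads=4)
```

Because no thread owns a fixed share of the work, a thread that is scheduled
less often simply takes fewer items, and the others pick up the rest.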
What I'm trying to say is that until someone actually shows some code which
has demonstrably lower performance on SCHED_ULE, and where this is not caused
by (IMHO) improper timing dependencies between threads, I'd say that there is
no cause for concern here. I actually expect performance differences between
the two schedulers to show up in workloads which cause a lot more contention
on the CPU cores and use lots of locks internally, so that threads are
frequently waiting on each other, for instance the MySQL benchmarks done a
couple of years ago by Kris Kennaway.
Aside from algorithmic limitations (SCHED_4BSD doesn't really scale all that
well), there will always exist some problems in which SCHED_4BSD is faster
because it happens to produce a better execution order for them. The good
thing is that people have a choice :-).
I'm looking forward to the results of your benchmark.
Pieter de Goeje
More information about the freebsd-stable