SCHED_ULE should not be the default

Tue Dec 13 15:54:57 UTC 2011

On Tue, Dec 13, 2011 at 02:23:46PM +0100, O. Hartmann wrote:
> On 12/12/11 16:51, Steve Kargl wrote:
> > On Mon, Dec 12, 2011 at 02:47:57PM +0100, O. Hartmann wrote:
> >>
> >>> Not quite right: boinc runs at idprio 31 by default, so this isn't an
> >>> issue. And yes, there are cases where SCHED_ULE shows much better
> >>> performance than SCHED_4BSD.  [...]
> >>
> >> Do we have any proof at hand for such cases where SCHED_ULE performs
> >> much better than SCHED_4BSD? Whenever the subject comes up, it is
> >> mentioned that SCHED_ULE has better performance on boxes with
> >> ncpu > 2. But in the end I see contradictory statements here. People
> >> complain about poor performance (especially in scientific environments),
> >> and others counter that this is not the case.
> >>
> >> Within our department, we developed highly scalable code for planetary
> >> science purposes on imagery. It uses GPUs via OpenCL if they are
> >> present. Otherwise it grabs as many cores as it can.
> >> By the end of this year I'll get a new desktop box based on Intel's new
> >> Sandy Bridge-E architecture with plenty of memory. If the colleague who
> >> developed the code is willing to perform some benchmarks on the same
> >> hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most
> >> recent Suse. For FreeBSD I also intend to compare performance under
> >> both available schedulers.
> >>
> > 
> > This comes up every 9 months or so, and must be approaching
> > FAQ status.
> > 
> > In an HPC environment, I recommend 4BSD.  Depending on
> > the workload, ULE can cause a severe increase in turnaround
> > time for computations that are already long.  If
> > you have an MPI application, simply launching more
> > than ncpu+1 jobs can show the problem.
> 
> Well, those recommendations should be based on a "WHY". As the mostly
> negative experiences with SCHED_ULE in compute-intensive workloads
> always get contradicted by "...but there are workloads that show the
> opposite ...", this should be shown by more recent benchmarks and
> explanations than legacy benchmarks from years ago.
> 

I have given the WHY in previous discussions of ULE, based
on what you call legacy benchmarks.  I have not seen any
commit to sched_ule.c that would lead me to believe that
the performance issues with ULE and cpu-bound numerical
codes have been addressed.  Repeating the benchmark would
be a waste of time.
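
For anyone who wants to look at the oversubscription case on their
own hardware (more CPU-bound jobs than cores), it can be approximated
without MPI.  What follows is only a rough sketch, not the benchmark
from the earlier threads: the names spin, run_jobs and slowdown and
the iteration count are made up for illustration, and plain Python
processes stand in for MPI ranks.  Run it once under each scheduler
and compare the per-job wall-clock times.

```python
import multiprocessing as mp
import os
import time


def spin(iterations):
    """Pure CPU-bound loop; returns the wall-clock seconds it took."""
    start = time.perf_counter()
    x = 0
    for i in range(iterations):
        x += i * i
    return time.perf_counter() - start


def _worker(iterations, results):
    # Each process times its own loop and reports the result to the parent.
    results.put(spin(iterations))


def run_jobs(njobs, iterations):
    """Start njobs identical CPU-bound processes at once; return their times."""
    results = mp.Queue()
    procs = [mp.Process(target=_worker, args=(iterations, results))
             for _ in range(njobs)]
    for p in procs:
        p.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    times = [results.get() for _ in procs]
    for p in procs:
        p.join()
    return times


def slowdown(times):
    """Ratio of the slowest job to the fastest one (1.0 means perfectly even)."""
    return max(times) / min(times)


if __name__ == "__main__":
    ncpu = os.cpu_count() or 1
    # Arbitrary work size (an assumption); raise it so that a single job
    # runs long enough for scheduling effects to dominate startup noise.
    iterations = 5_000_000
    for njobs in (ncpu, ncpu + 1, ncpu + 2):
        times = run_jobs(njobs, iterations)
        print(f"{njobs:3d} jobs: min {min(times):.2f}s  "
              f"max {max(times):.2f}s  max/min {slowdown(times):.2f}")
```

With njobs equal to ncpu the per-job times should be roughly equal.
The interesting rows are the oversubscribed ones, where the spread
between the fastest and the slowest job shows how evenly the
scheduler shares the cores.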

-- 
Steve