SCHED_ULE should not be the default

Adrian Chadd adrian at freebsd.org
Fri Dec 23 00:23:30 UTC 2011


On 22 December 2011 11:47, Steve Kargl <sgk at troutmask.apl.washington.edu> wrote:

[snip]

Thank you for posting some actual measurements!

> There is the additional observation in one of my 2008
> emails (URLs have been posted) that if you have N+1
> cpu-bound jobs with, say, job0 and job1 ping-ponging
> on cpu0 (due to ULE's cpu-affinity feature) and if I
> kill job2 running on cpu1, then neither job0 nor job1
> will migrate to cpu1.  So, one now has N cpu-bound
> jobs running on N-1 cpus.

... and this sounds like a pretty serious regression. Have you ever
filed a PR for it?

> Finally, my initial post in this email thread was to
> tell O. Hartman to quit beating his head against
> a wall with ULE (in an HPC environment).  Switch to
> 4BSD.  This was based on my 2008 observations and
> I've now wasted 2 days gather additional information
> which only re-affirms my recommendation.

I personally don't think this is time wasted. You've done something
that no one else has actually done: provided actual results from
real-life testing, rather than a hundred posts of "I remember seeing
X, so I don't use ULE."

If you can definitely and consistently reproduce that N-1 cpu bound
job bug, you're now in a great position to easily test and re-report
KTR/schedtrace results to see what impact they have. Please don't
underestimate exactly how valuable this is.

How often are those two jobs migrating between CPUs? How am I supposed
to read "CPU load"? Why isn't it just sitting at 100% the whole time?

Would you mind repeating this with 4BSD (the N+1 jobs) so we can see
how the jobs are scheduled/interleaved? Something tells me we'll see
the jobs being scheduled evenly.
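
For anyone who wants to repeat the N+1 experiment under both schedulers,
here is a minimal sketch in Python of how it could be scripted. It is
not the test Steve ran. The names fair_share, spawn_spinner, reap and
run_trial are made up for illustration, and it assumes a Unix-like
system with os.fork. It starts N+1 CPU-bound children on an N-CPU box,
kills one, and reports the share of a CPU each survivor gets afterwards.
If the survivors rebalance, every share should be close to 1.0; if two
of them stay stuck on one CPU, those two should sit near 0.5.

```python
import multiprocessing
import os
import signal
import time


def fair_share(n_jobs, n_cpus):
    """Ideal fraction of one CPU per job if the load is balanced perfectly."""
    return min(1.0, n_cpus / n_jobs)


def spawn_spinner():
    """Fork a CPU-bound child that keeps publishing its own CPU time."""
    # Shared memory allocated before the fork, so parent and child both see it.
    cpu_seconds = multiprocessing.RawValue("d", 0.0)
    pid = os.fork()
    if pid == 0:
        while True:  # child: spin until it is killed
            cpu_seconds.value = time.process_time()
    return pid, cpu_seconds


def reap(pid):
    """Kill a child and collect it so no zombie is left behind."""
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)


def run_trial(n_cpus, before=10.0, after=10.0):
    """Run n_cpus + 1 spinners, kill one, return the survivors' CPU shares.

    Each share is the CPU seconds a survivor used after the kill divided
    by the wall-clock seconds that elapsed after the kill.
    """
    jobs = [spawn_spinner() for _ in range(n_cpus + 1)]
    try:
        time.sleep(before)               # let the scheduler settle
        victim_pid, _ = jobs.pop()       # assumption: any job will do as victim
        reap(victim_pid)
        t0 = time.monotonic()
        baseline = [cpu.value for _, cpu in jobs]
        time.sleep(after)
        elapsed = time.monotonic() - t0
        return [(cpu.value - base) / elapsed
                for (_, cpu), base in zip(jobs, baseline)]
    finally:
        for pid, _ in jobs:
            reap(pid)
```

Calling run_trial(os.cpu_count()) and comparing the result against
fair_share(n, n) == 1.0 gives a rough pass/fail. The sketch does not
control which CPU the victim was running on, so a single run may not
hit the case Steve describes; it would need to be repeated a number of
times, and it is no substitute for KTR/schedtrace output.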


Adrian


More information about the freebsd-stable mailing list