ULE steal_idle questions

Bruce Evans brde at optusnet.com.au
Sat Aug 26 00:28:55 UTC 2017


On Fri, 25 Aug 2017, Don Lewis wrote:

> ...
> Something else that I did not expect is how frequently threads are
> stolen from the other SMT thread on the same core, even though I
> increased steal_thresh from 2 to 3 to account for the off-by-one
> problem.  This is true even right after the system has booted and no
> significant load has been applied.  My best guess is that because of
> affinity, both the parent and child processes run on the same CPU after
> fork(), and if a number of processes are forked() in quick succession,
> the run queue of that CPU can get really long.  Forcing a thread
> migration in exec() might be a good solution.
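
As a rough illustration of the steal_thresh reasoning quoted above, here
is a toy model in C.  It is not the sched_ule.c code: the names toy_cpu,
toy_fork_burst and toy_should_steal are made up for this sketch, and it
assumes that a CPU's load counts the running thread as well as the queued
ones, and that forked children initially land on the parent's CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; hypothetical names, not taken from the kernel. */
struct toy_cpu {
	int load;	/* assumed: running thread plus queued threads */
};

/* Assumed: each fork() queues the child on the parent's CPU. */
static void
toy_fork_burst(struct toy_cpu *cpu, int nforks)
{
	assert(nforks >= 0);
	cpu->load += nforks;
}

/* An idle CPU treats the victim as worth stealing from at this load. */
static bool
toy_should_steal(const struct toy_cpu *victim, int steal_thresh)
{
	return (victim->load >= steal_thresh);
}
```

Under these assumptions, a CPU that starts with load 1 (just the parent)
reaches a threshold of 2 after a single fork and a threshold of 3 after
two, so a short burst of forks is enough to make it a steal candidate
even on an otherwise idle system.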

Since you are trying a lot of combinations, maybe you can tell us which
ones work best.  SCHED_4BSD works better for me on an old 2-core system.
SCHED_ULE works better on a not-so-old 4x2 core (Haswell) system, but I
don't like it due to its complexity.  It makes differences of at most
+-2%, except that when mistuned it can give -5% for real time (but
better for CPU and presumably power).

For SCHED_4BSD, I wrote fancy tuning for fork/exec and sometimes get
everything to line up for a 3% improvement (803 seconds instead of 823
on the old system, with -current much slower at 840+ and old versions
of ULE before steal_idle taking 890+).  This is very resource (mainly
cache associativity?) dependent and my tuning makes little difference
on the newer system.  SCHED_ULE still has bugfeatures which tend to
help large builds by reducing context switching, e.g., by bogusly
clamping all CPU-bound threads to nearly maximal priority.

Bruce

