SCHED_ULE on sparc64

Jeff Roberson jroberson at jroberson.net
Fri May 20 21:41:08 UTC 2011


On Fri, 20 May 2011, Marius Strobl wrote:

> On Fri, May 20, 2011 at 10:41:02PM +1000, Peter Jeremy wrote:
>> On 2011-May-20 12:38:41 +0200, Marius Strobl <marius at alchemy.franken.de> wrote:
>>> The main problem with SCHED_ULE on sparc64 is that the MD code
>>> (ab)uses the global sched_lock of SCHED_4BSD to protect pm_context,
>>> pm_active and pc_pmap, partially of all CPUs, and SCHED_ULE doesn't
>>> use/provide such a lock. One could replace the use of sched_lock
>>> for that with a global MD spin lock but this has the issue that it
>>> would have to be acquired and released in cpu_switch(), which is next
>>> to impossible to do properly in assembler.
>>
>> Definitely messy but MIPS and PPC do it (at least the acquire - I
>> don't see how the lock is released in either case).
>
> I don't think these actually acquire a lock, all lock-related I can
> identify there are the equivalents of the following:
> 	atomic_store_rel_ptr(&old->td_lock, mtx);
> and:
> #if defined(SCHED_ULE) && defined(SMP)
> 		while (atomic_load_acq_ptr(&new->td_lock) == &blocked_lock)
> 			cpu_spinwait();
> #endif

Yes, the goal of passing the lock pointer into the switch function is that
the outgoing thread's lock is not released until we are off its stack.
Otherwise another CPU could start switching into that thread while we are
still on the way out.
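
As a rough illustration of that handoff, here is a minimal user-space sketch
using C11 atomics. Only td_lock and blocked_lock come from the snippet quoted
above; the struct definitions and the switch_* helper names are hypothetical
stand-ins, not the kernel's actual code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types (assumption, not FreeBSD's). */
struct mtx { int unused; };
struct thread { _Atomic(struct mtx *) td_lock; };

/* Sentinel meaning "this thread is still being switched out". */
static struct mtx blocked_lock;

/*
 * Run only after we have left the old thread's stack: publish the old
 * thread's real lock with release semantics so other CPUs may take it.
 */
static void
switch_release_old(struct thread *old, struct mtx *mtx)
{
	assert(mtx != &blocked_lock);
	atomic_store_explicit(&old->td_lock, mtx, memory_order_release);
}

/* Nonzero once the thread no longer points at blocked_lock. */
static int
switch_can_enter(struct thread *td)
{
	return (atomic_load_explicit(&td->td_lock, memory_order_acquire) !=
	    &blocked_lock);
}

/* Incoming side: spin until the previous CPU has finished with the thread. */
static void
switch_wait_new(struct thread *td)
{
	while (!switch_can_enter(td))
		;	/* the kernel would call cpu_spinwait() here */
}
```

A CPU that wants to run a thread calls switch_wait_new() first; it only
proceeds once the CPU that was switching away from that thread has called
switch_release_old().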

>
>>> The bottom line
>>> is that watching the various mailing lists so far didn't provide the
>>> necessary motivation to work on that to me though (even today you still
>>> find reports about performance problems with SCHED_ULE and suggestions
>>> to use SCHED_4BSD instead, just see 4DD55CE0.50202 at m5p.com as current
>>> example).

Can you give me another reference to this?  You have to realize that no 
scheduling policy will be faster for everything.  The goal is to be faster 
for most things and eliminate worst case scenarios.  I can look at this 
soon if there is something to be done.

>>
>> OTOH, not using it won't get the bugs fixed.
>
> They certainly won't but typically I hit enough problems when trying to
> get code developed on x86 or actually written with only x86 in mind to
> work on sparc64 that I don't really feel the desire to go out hunting for
> generic bugs in that code. In any case my motivation for getting SCHED_ULE
> to work on sparc64 suddenly vanished with r171488 for some strange reason.

I really don't know what the status of the sparc64 port is.  If it is 
intended to be first tier it should support ULE.  Features like cpusets 
and topology aware scheduling are better supported on ULE.  It is 
generally considered the path forward for SMP.  4BSD with its global run 
queue and global lock is a dead end unless someone wants to salvage its 
priority computation mechanism and add the cpu load balancing features 
that end up making ULE slower in some cases.

>
>> My rationale for firing
>> up the spare V890 at $work was to try and stress some of the big
>> systems code and SCHED_ULE is supposed to be better at handling lots of
>> CPUs than SCHED_4BSD.
>>
>
> I don't think 16 cores counts as a lot these days :)

The per-cpu scheduler locks showed massive improvements on some workloads
with only 4 cores.  At 16 cores the global scheduler lock is probably a
significant point of contention for any workload.  16 cores is not a big
machine anymore, but it is plenty for heavy contention to soak up too many
cycles.
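
To make the structural difference concrete, here is a minimal sketch of
per-cpu run queues that each carry their own spin lock, so CPUs working on
different queues never contend. This is not ULE's actual data structure; NCPU,
struct runq and the runq_* helpers are hypothetical names chosen for the
example, and the "queue" is reduced to a counter.

```c
#include <stdatomic.h>

#define NCPU 4			/* hypothetical CPU count for the sketch */

struct runq {
	atomic_flag lock;	/* spin lock private to this queue */
	int nthreads;		/* runnable threads queued on this CPU */
};

/* One queue per CPU; a 4BSD-like layout would have one queue, one lock. */
static struct runq percpu_runq[NCPU];

static void
runq_init(void)
{
	for (int i = 0; i < NCPU; i++) {
		atomic_flag_clear(&percpu_runq[i].lock);
		percpu_runq[i].nthreads = 0;
	}
}

static void
runq_lock(struct runq *rq)
{
	while (atomic_flag_test_and_set_explicit(&rq->lock,
	    memory_order_acquire))
		;	/* spin; only users of this one queue contend here */
}

static void
runq_unlock(struct runq *rq)
{
	atomic_flag_clear_explicit(&rq->lock, memory_order_release);
}

/* Queue one runnable thread on the given CPU. */
static void
runq_add(int cpu)
{
	struct runq *rq = &percpu_runq[cpu];

	runq_lock(rq);
	rq->nthreads++;
	runq_unlock(rq);
}

/* Read the load of one CPU under that CPU's lock only. */
static int
runq_load(int cpu)
{
	struct runq *rq = &percpu_runq[cpu];
	int n;

	runq_lock(rq);
	n = rq->nthreads;
	runq_unlock(rq);
	return (n);
}
```

With a single global lock every runq_add() on every CPU would serialize on
the same cache line; here that only happens when two CPUs touch the same
queue, for example during load balancing.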

Thanks,
Jeff

>
> Marius
>

