On schedulers

Mon Aug 6 23:22:11 UTC 2007

On Sun, 5 Aug 2007, Ivan Voras wrote:

> On Fri, 3 Aug 2007, Jeff Roberson wrote:
>
>> On Thu, 2 Aug 2007, Jeff Roberson wrote:
>> 
>>> On Thu, 2 Aug 2007, Niki Denev wrote:
>>> 
>>>> Both idle and glxgears are run as normal user.
>>> 
>>> Can you tell me what % cpu is going to each process during this time? 
>>> These results are surprising.  For workloads like this ULE should 
>>> essentially implement a 'fair' scheduling policy.  However, so should 
>>> 4BSD.  So I'm not yet sure why the slowdown wouldn't be relative to the 
>>> number of running threads.  Also, 'vmstat 1' output would be useful.
>
> I'm glad this discussion is happening, but:
>
> - I wasn't really interested in 3D performance, but mostly in if there's 
> theoretical modelling of how ULE should perform, and/or its comparison to 
> Linux (e.g. elaboration of what 'fair' means for ULE).

Well, you have to put these discussions of 'completely fair' schedulers in 
the proper context.  By the CFS definition, 4BSD is 'completely fair'. 
That is, it attempts to give all processes an equal fraction of the CPU 
within a given period (ignoring nice).  ULE essentially splits timesharing 
into two classes separated by a tunable interactivity heuristic.

If a thread is not determined to be interactive it is scheduled fairly 
with all other threads.  If it is, it gets something more similar to a 
real-time priority, however still with 100ms slices.  An interactive task 
which uses too much CPU will be bounced back to split time evenly. 
Interactive tasks are scheduled 'fairly' relative to each other.  That is, 
they will split CPU fairly among themselves as long as they remain 
interactive.
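
To make the two-class split concrete, here is a minimal sketch in Python. 
It is not ULE's code: the class and method names are made up for 
illustration, every thread is assumed to be always runnable, and slices, 
priorities and sleeping are ignored.

```python
from collections import deque


class TwoClassQueue:
    """Toy model of two timesharing classes (hypothetical, not ULE's code)."""

    def __init__(self):
        self.interactive = deque()  # threads judged interactive
        self.batch = deque()        # everything else

    def add(self, name, interactive):
        (self.interactive if interactive else self.batch).append(name)

    def pick_next(self):
        # Interactive threads are preferred; within a class, threads take
        # turns round-robin, so they split the CPU evenly among themselves.
        for queue in (self.interactive, self.batch):
            if queue:
                name = queue.popleft()
                queue.append(name)
                return name
        return None

    def demote(self, name):
        # An interactive thread that uses too much CPU is bounced back to
        # the class that splits time evenly.
        self.interactive.remove(name)
        self.batch.append(name)
```

In this toy model the batch class only runs once no interactive thread is 
queued.  Real interactive threads spend most of their time asleep, which 
is what leaves room for everything else.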

The interactive heuristic is simple.  Fairness is determined by a history 
of runtime in 4BSD, ULE, and CFS.  In ULE we separately keep track of 
voluntary sleep time vs runtime.  Voluntary sleep time does not include 
time spent waiting on the run-queue.  This simple heuristic seems to work 
out well for ULE even with short bursts of activity for otherwise idling 
threads (e.g. rendering a page in firefox).  The fairness heuristic is 
really %cpu over the last N seconds; the interactivity heuristic is 
runtime/sleeptime over the last Y seconds.
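
As a rough illustration of the two measures, here is a sketch in Python. 
It is not the scheduler's actual arithmetic: the function names and the 
0.5 threshold are assumptions chosen for the example, and the windows (N 
and Y above) are simply whatever interval the inputs were collected over.

```python
def cpu_share(runtime, window):
    """Fairness measure: fraction of the last `window` seconds spent running."""
    return runtime / window if window > 0 else 0.0


def run_sleep_ratio(runtime, sleeptime):
    """Interactivity measure: runtime relative to voluntary sleep time.

    Time spent waiting on the run-queue must not be counted in `sleeptime`.
    """
    if sleeptime == 0:
        return float("inf") if runtime > 0 else 0.0
    return runtime / sleeptime


def is_interactive(runtime, sleeptime, threshold=0.5):
    # The threshold is a hypothetical value for illustration only.
    return run_sleep_ratio(runtime, sleeptime) < threshold
```

A thread that ran for 0.2s and slept voluntarily for 5s has a ratio of 
0.04 and counts as interactive here; one that ran for 4s and slept for 
0.5s has a ratio of 8 and does not.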

The one potential problem is that many 'interactive' tasks could starve 
non-interactive tasks for CPU time.  In this case you can tune down the 
interactivity threshold, or disable it altogether, giving results similar 
to 4BSD/CFS.  See kern.sched.interact.

> - People who know (meaning those who work with or develop X11) say that 
> glxgears is awful for testing graphical performance. I don't know exactly why 
> is that, but I've seen widely varying results from glxgears on related 
> mailing lists that seem to confirm this. From personal experience I've seen 
> glxgears "topping out" with much idle CPU left, both extremely high and 
> extremely low results from it on hardware that shouldn't behave like that, so 
> I agree with this. Quake should be much better for benchmarking :)
>

Yes, I'd be interested in seeing an apples-to-apples comparison with 
quake, although I don't know how our hardware 3D support compares to 
Linux.  I have done some comparisons myself.  For example, running a -j32 
compile while watching a movie and using a web browser on a single 
processor laptop yields no lag in the movie or browser for me with ULE. 
With Linux I find the system mostly unusable.  This is completely 
unscientific, however.

Thanks,
Jeff