Re: Periodic rant about SCHED_ULE

From: Mark Millard <>
Date: Thu, 23 Mar 2023 22:06:14 UTC
On Mar 23, 2023, at 13:53, Warner Losh <> wrote:
>> On Thu, Mar 23, 2023, 9:46 PM Mark Millard <> wrote:
>> Warner Losh <> wrote on
>> Date: Wed, 22 Mar 2023 22:57:08 UTC :
>> > On Wed, Mar 22, 2023 at 1:41 PM George Mitchell <>
>> > wrote:
>> > 
>> > > service dnetc start
>> > > I am literally running "make buildworld" with no additional options.
>> > >
>> > >
>> > So what are the results for make buildworld -j $(sysctl -n hw.ncpu)?
>> Note: My experiments have been in this -j $(sysctl -n hw.ncpu )
>> realm.
>> > ULE scales much better, but when there's too little to do it can make poor
>> > choices.
>> > 
>> > ULE is better locked and doesn't fall over on high core count systems like
>> > BSD does at moderate load.
>> (I'm presuming the above is not about the specifics of
>> the two schedulers' effectively different handling of
>> extra "nice 20" activity in the examples related to the
>> original "rant", other than the -jN issue.)
>> Any idea on what scale "high core count systems" need to be
>> for what sort of "moderate load" to end up with significant
>> differences?
> Sched_bsd is basically unusable on my 64 core 128 thread machine with make -j 150 (nice or no).

So, well beyond what I've access to, even if 32 core 64 thread
would also show such issues.

> With ULE I don't notice. That's not to say ule can't be better (me not noticing is hardly scientific), but I tried sched bsd when I got the thread ripper and found the machine too unresponsive when I was doing large builds... 

So the classification is based on responsiveness. Good to know.

(I assume you had avoided having interactive processes end up
with their kernel stacks swapped out. That is a separate sort
of issue.)

> But I wasn't playing video on this box... so maybe I hit a local optimal point...

No X11 or such involved in my context. Mostly ssh over Ethernet.
I'm rarely at the (video) console (no serial console). I did not
see responsiveness issues in the ssh sessions during my
activities with the 2 schedulers. (So I'd not explicitly
commented about such at the time.)

Thanks for reporting the example.

> Warner
>> What sort of context(s) show ULE scaling much
>> better? On the 16 core ThreadRipper 1950X (32 hardware
>> threads) I've really only demonstrated the "nice 20"
>> distinction as significant between the schedulers so far.
>> (I do not have access to anything with more hardware threads.)
>> Note: I've not (yet?) been looking at having just a little
>> more than the number of hardware threads active (no nice
>> involvement).

Side note: In order to be sure to avoid interactive
processes ending up with their kernel stacks swapped
out, I use the following in /etc/sysctl.conf (which
prevents far more than just those from ending up that way):

# Together this pair avoids swapping out the process kernel stacks.
# This avoids processes for interacting with the system from being
# hung-up by such.
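# (The pair itself, assuming the stock FreeBSD knob names
# for disabling process swap-out, would be:)
vm.swap_enabled=0
vm.swap_idle_enabled=0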

While I have such settings everywhere, only some machines,
and a rare type of use on them, actually likely need the
above in my context. After adding the above, I've never
had the loss-of-access problem again.

But nothing I've done would indicate much about X11
use reasonableness, for example.

Mark Millard
marklmi at