Re: Periodic rant about SCHED_ULE

From: Mark Millard <marklmi_at_yahoo.com>
Date: Thu, 23 Mar 2023 01:08:54 UTC

On Mar 22, 2023, at 18:03, Mark Millard <marklmi@yahoo.com> wrote:

> On Mar 22, 2023, at 16:17, Mark Millard <marklmi@yahoo.com> wrote:
> 
>> On Mar 22, 2023, at 15:39, Mark Millard <marklmi@yahoo.com> wrote:
>> 
>>> On Mar 22, 2023, at 13:34, Mark Millard <marklmi@yahoo.com> wrote:
>>> 
>>>> On Mar 22, 2023, at 12:40, George Mitchell <george+freebsd@m5p.com> wrote:
>>>> 
>>>>> On 3/22/23 15:21, Mark Millard wrote:
>>>>>> George Mitchell <george+freebsd@m5p.com> wrote on
>>>>>> Date: Wed, 22 Mar 2023 17:36:39 UTC :
>>>>>> [...]
>>>>>>> Here are the very complicated instructions for reproducing the problem:
>>>>>>> 1. Install and start misc/dnetc from ports.
>>>>>> Installing is likely easy, as is building with the default
>>>>>> options (if any). I know nothing about starting misc/dnetc,
>>>>>> so that needs research. (It is possibly trivial, although if
>>>>>> it has settings to control, then I'd need to match your
>>>>>> context for those too.)
>>>>> 
>>>>> service dnetc start
>>>> 
>>>> I built and installed misc/dnetc and got a binary
>>>> blob that clearly was not built in my environment:
>>>> 
>>>> # file /usr/local/distributed.net/dnetc
>>>> /usr/local/distributed.net/dnetc: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), statically linked, for FreeBSD 10.1 (1001515), FreeBSD-style, stripped
>>>> 
>>>> That is a far older FreeBSD vintage than the locally available
>>>> toolchains would normally build for. Some might be cautious about
>>>> such a thing.
>>>> 
>>>> The man page reported that:
>>>> 
>>>> QUOTE
>>>>  If you have never run the client before, it will initiate the menu-driven
>>>>  configuration. Save and quit when done, the configuration file will be
>>>>  saved in the same directory as the client. Now, simply restart the
>>>>  client. From that point on it will use the saved configuration.
>>>> END QUOTE
>>>> 
>>>> I've not seen what the configuration asks about yet.
>>> 
>>> I went through the configuration, basically just looking
>>> at it, other than providing an E-mail address. Then . . .
>>> 
>>> $ sudo service dnetc start
>>> Password:
>>> Cannot 'start' dnetc. Set dnetc_enable to YES in /etc/rc.conf or use 'onestart' instead of 'start'.
>>> 
>>> $ sudo service dnetc onestart
>>> 
>>> I just let it run without any extra competing activity, other
>>> than I had my patched version of top running. It records and
>>> reports various maximum-observed (MaxObs) figures, here
>>> the load averages being relevant.
>>> 
>>> Top showed that dnetc started 32 processes, one per hardware
>>> thread. Mostly I saw: 100% nice and 0% idle.
>>> 
>>> Letting it run and then looking at the load averages (and
>>> their matching MaxObs figures) after something like 60+ min
>>> (not carefully timed: was doing other things) showed:
>>> 
>>> load averages:  31.97,  31.88,  31.66 MaxObs:  32.12,  31.97,  31.66
>>> 
>>> (Note: The machine had been up for over 2.75 days before
>>> starting this and had not been building much of anything
>>> during that time.)
>>> 
>>> I've not yet experimented with having other, significant
>>> competing activity.
>>> 
>>>>>>> 2. Run "make buildworld".
>>>>>> So on the 32 hardware-thread (16 cores) amd64 machine that
>>>>>> I have access to, the test is to only have buildworld use
>>>>>> about one hardware thread, no matter what else is going on.
>>>>>> I never would have guessed that the steps would not involve
>>>>>> more like -j$(sysctl -n hw.ncpu) (so around -j32 in this
>>>>>> context). So it is good that you provided your note or
>>>>>> I'd not know if I'd done similarly or not when trying such.
>>>>>> [Note: -j1 and lack of -j are not strictly equivalent in
>>>>>> how make operates. As I remember, the distinction makes
>>>>>> a notable difference in the number of subprocesses created
>>>>>> directly by make (one per action "line" vs. one for the
>>>>>> whole block?). So even using -j1 might make a difference
>>>>>> vs. what you specified. I'd have to test to see.]
>>>>> 
>>>>> I am literally running "make buildworld" with no additional options.
>>>> 
>>>> So that is required for repeating your results, but it likely
>>>> makes such results uninteresting relative to how I normally
>>>> deal with buildworld, buildkernel, and the like, whether or not
>>>> there is other activity in an overlapping time frame:
>>>> my time preferences are too strong to wait for a single
>>>> hardware thread to do my normal builds, even with no
>>>> competing activity on the builder.
>>>> 
>>>>>>> Standard out conveniently reports how long it took (wall clock).
>>>>>> But nothing in your instructions indicates how
>>>>>> to get an idea of how much progress dnetc made during
>>>>>> the various tests. [...]
>>>>> 
>>>>> Honestly, I've never worried about this part.  But dnetc logs its
>>>>> progress in /usr/local/distributed.net/dnetc.txt, though not in terms
>>>>> that are easy to relate to real-world progress.  Oddly, when I run
>>>>> "make buildworld," I'm primarily interested in getting the world built.
>>>>> Perhaps others feel differently.
>>>> 
>>>> Off topic for the specifics of the actual benchmark
>>>> that you run:
>>>> 
>>>> Then why not use -jN? In my context, any buildworld
>>>> using -j1 or no -j at all takes a huge amount of time
>>>> longer than letting it use all the hardware threads (or
>>>> so). (I've avoided having any I/O bound contexts for
>>>> such.) It does not take additional load on the system
>>>> for that to be true --including on the 4-core small arm
>>>> boards when I happen to buildworld on such (rare).
>>>> 
>>>> 
>>>>>> [...]
>>>>>> FYI: I've never built with and run the alternate
>>>>>> scheduler so if there is any appropriate background
>>>>>> for that that would not be obvious on finding basic
>>>>>> instructions, it would be appropriate to provide
>>>>>> such notes.
>>>>>> [...]
>>>>> 
>>>>> You have to build a new kernel, using a config file in which you have
>>>>> replaced "options SCHED_ULE" with "options SCHED_4BSD".     -- George
>>>> 
>>>> Thanks for the notes.
>>>> 
>>>> I've not decided if I'll do anything with the binary
>>>> blob or not.
>>> 
>> 
>> FYI:
>> 
>> It is not your specific experiment, but I started my
>> "extra load" experiments with . . .
>> 
>> I started a -j32 buildworld buildkernel with dnetc still
>> running. I'm generally seeing around 55% Active and 42%
> 
> Note "Active": user, sorry.
> 
>> nice, < 2% system (it was building libllvm at this point).
>> At that time:
>> 
>> load averages:  64.41,  60.52,  49.81 MaxObs:  64.47,  60.52,  49.81
>> 
> 
> Contrasting results for some obj-lib32 build activity:
> much more variety of User, nice, and system, including
> times with < 5% user, 90+% nice. But not typical overall.
> But lots of time roughly around 50%/50% or 35%/60%. There
> were times with 15+% system.
> 
> Somewhat after buildkernel started:
> 
> load averages:  69.15,  64.12,  58.72 MaxObs:  75.98,  64.12,  58.72
> 
> That is harder to summarize, so here are the overall timing
> reports from the buildworld and buildkernel stages.
> 
> 
> buildworld:
> 
> --------------------------------------------------------------
> ... World build completed on Wed Mar 22 16:37:57 PDT 2023
> ... World built in 2615 seconds, ncpu: 32, make -j32
> --------------------------------------------------------------
> 
> 
> buildkernel:
> 
> --------------------------------------------------------------
> ... Kernel build for GENERIC-NODBG completed on Wed Mar 22 16:43:10 PDT 2023
> --------------------------------------------------------------
> ... Kernel(s)  GENERIC-NODBG built in 311 seconds, ncpu: 32, make -j32
> --------------------------------------------------------------
> 
> Afterwards:
> 
> load averages:  36.08,  53.14,  55.79 MaxObs:  75.98,  65.77,  59.84
> 
> 
> I then did (not all in the same window):
> 
> $ sudo service dnetc onestop
> # rm -fr /usr/obj/BUILDs/main-amd64-nodbg-clang-alt/usr/
> 
> before another -j32 buildworld buildkernel (no dnetc). The
> results for this were:
> 
> 
> buildworld:
> 
> --------------------------------------------------------------
> ... World build completed on Wed Mar 22 17:39:19 PDT 2023
> ... World built in 1240 seconds, ncpu: 32, make -j32
> --------------------------------------------------------------
> 
> (compared to the 2615 for dnetc also in use)
> 
> 
> buildkernel:
> 
> --------------------------------------------------------------
> ... Kernel build for GENERIC-NODBG completed on Wed Mar 22 17:41:17 PDT 2023
> --------------------------------------------------------------
> ... Kernel(s)  GENERIC-NODBG built in 118 seconds, ncpu: 32, make -j32
> --------------------------------------------------------------
> 
> (compared to the 311 for dnetc also in use)

I forgot to show the MaxObs load averages for the no-dnetc
context:

MaxObs:  39.77,  32.15,  25.75
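
For a rough comparison, the with-dnetc vs. no-dnetc ratios can be
worked out from the -j32 timing figures quoted in this message. A
minimal Python sketch of that arithmetic (the names "timings" and
"slowdown" are only illustrative, not from any tool used here, and
each figure is from a single run, not an average):

```python
# Wall-clock seconds reported by the -j32 builds quoted in this message.
timings = {
    "buildworld": {"with_dnetc": 2615, "without_dnetc": 1240},
    "buildkernel": {"with_dnetc": 311, "without_dnetc": 118},
}


def slowdown(with_load, without_load):
    """Hypothetical helper: ratio of loaded to unloaded wall-clock time."""
    return with_load / without_load


# One ratio per build stage.
ratios = {
    stage: slowdown(t["with_dnetc"], t["without_dnetc"])
    for stage, t in timings.items()
}

for stage, ratio in ratios.items():
    print(f"{stage}: {ratio:.2f}x")
# prints:
# buildworld: 2.11x
# buildkernel: 2.64x
```

So, for this one pair of -j32 runs, having dnetc active roughly
doubled the buildworld time and made buildkernel take about 2.6
times as long.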

> Experiments without -j32 will take a lot longer, even
> without dnetc in use. I'm not sure there will be such
> results today.
> 


===
Mark Millard
marklmi at yahoo.com

