Not to beat a dead horse, but ...

Attilio Rao attilio at freebsd.org
Tue Jul 1 14:32:26 UTC 2014


On Sun, Jun 8, 2014 at 8:15 PM, George Mitchell <george+freebsd at m5p.com> wrote:
> When I run this command on 10-STABLE on a uniprocessor system while
> running the misc/dnetc port:
>
> cd /usr/src
> time make buildworld && time make buildkernel && time make installkernel
>
> On revision 266422 with SCHED_ULE, I get (showing the time lines only):
>
> 7045.988u 897.681s 4:00:33.89 55.0%     29430+492k 27927+17003io 30943pf+519w
> 1155.683u 149.422s 52:49.60 41.1%       25418+410k 7452+20843io 12166pf+248w
> 7.101u 4.838s 8:03.57 2.4%      5905+221k 1179+9461io 1345pf+67w
>
> On revision 267211 with SCHED_4BSD:
>
> 6950.087u 665.074s 2:40:36.19 79.0%     29929+502k 33651+17368io 31151pf+151w
> 1148.066u 134.312s 26:40.95 80.1%       26234+426k 9681+24613io 11917pf+106w
> 6.774u 4.369s 0:33.90 32.8%     3110+320k 1388+10979io 1514pf+3w
>
> Since the majority of my systems are uniprocessors and I like to
> run dnetc, SCHED_ULE has been a dealbreaker for me since day one.
> Consequently I can't use freebsd_update.

So I think the problem here is essentially that dnetc behaves in
entirely different ways under the 2 schedulers, but you simply don't
care how much work it manages to carry out during your buildworld
workload. As a high-level description, it is as if the CPU runtime is
partitioned in a balanced way in the ULE case, while with 4BSD there is
a huge bias toward the buildworld CPU%.

Both threads (actually a set of threads for buildworld) fall in the
time-share priority range and they are all treated the same by the
scheduler.
However, unlike 4BSD, ULE has algorithms that dynamically adjust the
priority of threads in order to calculate the interactivity scores
properly, and that dynamically recalculate the thresholds for the RR
quantum for timeshare priority threads. The quantum decreases
proportionally to the runqueue load. This essentially means that the
more work you push onto the runqueue (and buildworld spawns quite a
few threads; in your schedgraph traces there were around 5/6 new
ones), the higher the turnaround needed to properly partition the CPU
time among all these time-share priority threads.
It also means there will be many more context switches than in the
4BSD case.
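
To make the effect of a load-dependent quantum more concrete, here is a
rough model in sh. It is only an illustration, not the scheduler's code:
the 100ms base slice, the 10ms floor and the function names are all
assumptions picked to keep the arithmetic easy to follow.

```sh
#!/bin/sh
# Illustrative model only, NOT the kernel's implementation.
# Assumed (hypothetical) values: 100 ms base slice, 10 ms minimum slice.
BASE_SLICE_MS=100
MIN_SLICE_MS=10

# slice_ms LOAD: print the slice each thread would get when LOAD
# timeshare threads are runnable, shrinking with load down to a floor.
slice_ms() {
    load=$1
    if [ "$load" -le 1 ]; then
        echo "$BASE_SLICE_MS"
        return
    fi
    slice=$((BASE_SLICE_MS / load))
    if [ "$slice" -lt "$MIN_SLICE_MS" ]; then
        slice=$MIN_SLICE_MS
    fi
    echo "$slice"
}

# switches_per_sec LOAD: rough upper bound on quantum expirations per
# second at that slice length (1000 ms divided by the slice).
switches_per_sec() {
    echo $((1000 / $(slice_ms "$1")))
}

# Print the model for a few runqueue loads.
for n in 1 2 4 6 8; do
    echo "load=$n slice=$(slice_ms "$n")ms switches/s=$(switches_per_sec "$n")"
done
```

With a fixed quantum the slice column would stay constant no matter how
many threads buildworld adds, which is the difference I am pointing at.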

In the 4BSD case, instead, the RR quantum remains essentially fixed. I
cannot say for sure because I don't know its code, but I expect that
dnetc has some provision for yielding voluntarily after a "fraction"
of the expected RR quantum has been used. So I expect this
computational time to be smaller than 100ms (the default quantum, i.e.
time slice, for 4BSD).

To get out of this situation and prove that what I'm saying is right,
you can try 2 different things:
- Renice the buildworld to move it out of the timeshare-priority area
and into the kernel/real-time area. I suspect this will leave dnetc
doing very little work, but I expect you don't really care. However,
it will also make the workload compete against kernel services.
- Enlarge your RR quantum for timeshare priority threads. You can do
that via the kern.sched.quantum sysctl. I think you can aim for
200ms. I think this should be the preferred option.
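
For the second option, a minimal sketch follows. I am assuming here
that kern.sched.quantum is expressed in microseconds; please check the
output of "sysctl -d kern.sched.quantum" on your system before relying
on that. The script only prints the commands so you can review them
before running them as root.

```sh
#!/bin/sh
# Target quantum in milliseconds (the 200ms suggested above).
quantum_ms=200

# Assumption: the sysctl takes microseconds, so convert ms -> us.
quantum_us=$((quantum_ms * 1000))

# Inspect the description and the current value first.
echo "sysctl -d kern.sched.quantum"
echo "sysctl kern.sched.quantum"

# Then set the new value at runtime.
echo "sysctl kern.sched.quantum=${quantum_us}"

# To keep it across reboots, this line can go in /etc/sysctl.conf.
echo "kern.sched.quantum=${quantum_us}"
```

After changing it, rerunning the same timed buildworld with dnetc
active should show whether the quantum is really the culprit.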

Of course, please keep disabling the SMP option for your uniprocessor kernel.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein

