svn commit: r219003 - head/usr.bin/nice
Bruce Evans
brde at optusnet.com.au
Thu Feb 24 22:23:01 UTC 2011
On Fri, 25 Feb 2011, Bruce Evans wrote:
> On Thu, 24 Feb 2011, John Baldwin wrote:
>
>> On Thursday, February 24, 2011 2:03:33 pm Remko Lodder wrote:
>>>
> [context restored:
> +A priority of 19 or 20 will prevent a process from taking any cycles from
> +others at nice 0 or better.]
>
>>> On Feb 24, 2011, at 7:47 PM, John Baldwin wrote:
>>>
>>>> Are you sure that this statement applies to both ULE and 4BSD? The two
>>>> schedulers treat nice values a bit differently.
>>>
>>> No, I am not sure that the statement applies. Given your response I
>>> understand that both schedulers work differently. Can you or David
>>> tell me what the difference is so that I can properly document it? I
>>> thought that the tool does the same for all schedulers, but that the
>>> backend might treat it differently.
>
> I'm sure that testing would show that it doesn't apply in FreeBSD. It is
> supposed to apply only approximately in FreeBSD, but niceness handling in
> FreeBSD is quite broken, so it doesn't apply at all. Also, the magic
> numbers of 19 and 20 probably don't apply in FreeBSD. These arose because
> nicenesses that are the same mod 2 (maybe after adding 1) have the same
> effect, since priorities that are the same mod RQ_PPQ = 4 have the same
> effect and the niceness space was scaled to the priority space by
> multiplying by NICE_WEIGHT = 2. But NICE_WEIGHT has been broken to be 1
> in FreeBSD with SCHED_4BSD and doesn't apply with SCHED_ULE. With
> SCHED_4BSD, there are 4 (not 2) nice values near 20 that give the same
> behaviour.
>
> It strictly only applies to broken schedulers. Preventing a process
> from taking *any* cycles gives priority inversion livelock. FreeBSD
> has priority propagation to prevent this.
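To make the quantization described above concrete, here is a toy
illustration (my reconstruction, with a made-up base priority; not the
kernel's actual code):

#include <stdio.h>

#define NICE_WEIGHT	2	/* historical SCHED_4BSD value; now broken to 1 */
#define RQ_PPQ		4	/* priorities per runqueue band */
#define BASE_PRI	120	/* hypothetical user base priority */

int
main(void)
{
	int nice, prio;

	for (nice = 0; nice <= 20; nice++) {
		prio = BASE_PRI + NICE_WEIGHT * nice;
		/* Nicenesses mapping into the same RQ_PPQ-wide band
		 * are scheduled identically. */
		printf("nice %2d -> prio %3d (band %d)\n",
		    nice, prio, prio / RQ_PPQ);
	}
	return (0);
}

With NICE_WEIGHT = 2, nicenesses pair up mod 2 within each band of
RQ_PPQ = 4; with NICE_WEIGHT = 1, four consecutive nicenesses share a
band, which is the 4-values-near-20 behaviour described above.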
I just tried it with SCHED_4BSD, on a multi-CPU system (ref9-i386), but
I think I used cpuset correctly to emulate a single CPU.
% last pid: 85392; load averages: 1.71, 0.86, 0.38 up 94+01:00:36 21:55:59
% 66 processes: 3 running, 63 sleeping
% CPU: 6.9% user, 3.7% nice, 2.0% system, 0.0% interrupt, 87.3% idle
% Mem: 268M Active, 4969M Inact, 310M Wired, 50M Cache, 112M Buf, 2413M Free
% Swap: 8192M Total, 580K Used, 8191M Free
%
% PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
% [... system is not nearly idle, but plenty of CPUs to spare]
% 85368 bde 1 111 0 9892K 1312K RUN 1 1:07 65.67% sh
% 85369 bde 1 123 20 9892K 1312K CPU1 1 0:35 37.89% sh
This shows the bogus 1:2 ratio (here 37.89% vs. 65.67%, about 1:1.7)
even for a niceness difference of 20. I've seen too much of this ratio.
IIRC, before FreeBSD-4 was fixed, the various nonlinearities caused by
not even clamping, combined with the broken scaling, gave a ratio of
about this. Then FreeBSD-5 restored a similarly bogus ratio. Apparently,
the algorithm for decaying p_estcpu in SCHED_4BSD tends to generate this
ratio. SCHED_ULE uses a completely different algorithm, and I think it
has more control over the scaling, so it is surprising that it
duplicates this brokenness so perfectly.
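To see why the p_estcpu decay feedback tends to compress the ratio, here
is a toy simulation. It is a floating-point caricature of the classic
schedcpu() recurrence with assumed constants (stathz of 100, an estcpu
priority weight of 1/4, NICE_WEIGHT of 1, and a fixed load average of
2), not the kernel's fixed-point code:

#include <stdio.h>

#define HZ	100	/* assumed stathz ticks per second */

int
main(void)
{
	double est[2] = { 0.0, 0.0 };
	double load = 2.0;	/* assumed load average */
	double decay = (2.0 * load) / (2.0 * load + 1.0);
	int nice[2] = { 0, 20 };
	long ran[2] = { 0, 0 };
	int sec, tick, w;

	for (sec = 0; sec < 300; sec++) {
		for (tick = 0; tick < HZ; tick++) {
			/* The numerically lower priority runs and
			 * accumulates estcpu. */
			double p0 = est[0] / 4.0 + nice[0];
			double p1 = est[1] / 4.0 + nice[1];

			w = (p0 <= p1) ? 0 : 1;
			est[w] += 1.0;
			ran[w]++;
		}
		/* Once-per-second decay: est *= 2*load/(2*load + 1). */
		est[0] *= decay;
		est[1] *= decay;
	}
	printf("nice 0: %ld ticks, nice 20: %ld ticks (%.2f:1)\n",
	    ran[0], ran[1], (double)ran[0] / ran[1]);
	return (0);
}

Under these assumptions the shares settle near 3:2: the running process
accumulates estcpu until its priority drifts past the other's, and the
per-second decay then erases most of the difference, so a niceness
difference of 20 buys only a modest edge.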
And here is what it does with more nice values; this was generated by:
% for i in 0 2 4 6 8 10 12 14 16 18 20
% do
% cpuset -l 1 nice -$i sh -c "while :; do echo -n;done" &
% done
% top -o time
% last pid: 85649; load averages: 10.99, 9.06, 5.35 up 94+01:19:33 22:14:56
% 74 processes: 12 running, 62 sleeping
%
% Mem: 270M Active, 4969M Inact, 310M Wired, 50M Cache, 112M Buf, 2411M Free
% Swap: 8192M Total, 580K Used, 8191M Free
%
% PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
% 85581 bde 1 98 0 9892K 1312K RUN 1 0:48 11.47% sh
% 85582 bde 1 100 2 9892K 1312K RUN 1 0:45 10.69% sh
% 85583 bde 1 102 4 9892K 1312K RUN 1 0:42 10.35% sh
% 85584 bde 1 104 6 9892K 1312K CPU1 1 0:40 9.47% sh
% 85585 bde 1 106 8 9892K 1312K RUN 1 0:38 8.79% sh
% 85586 bde 1 108 10 9892K 1312K RUN 1 0:36 8.06% sh
% 85587 bde 1 110 12 9892K 1312K RUN 1 0:34 8.40% sh
% 85588 bde 1 111 14 9892K 1312K RUN 1 0:33 8.50% sh
% 85589 bde 1 113 16 9892K 1312K RUN 1 0:31 7.67% sh
% 85590 bde 1 115 18 9892K 1312K RUN 1 0:30 7.28% sh
% 85591 bde 1 117 20 9892K 1312K RUN 1 0:29 6.69% sh
This is OK except for the far-too-small dynamic range of 29:48, about
1:1.7 between nice 20 and nice 0 (even worse than 1:2).
My version spaces things out nicely according to its table:
% last pid: 1374; load averages: 11.02, 8.74, 4.93 up 0+02:26:12 09:16:47
% 43 processes: 12 running, 31 sleeping
% CPU: 14.0% user, 85.7% nice, 0.0% system, 0.4% interrupt, 0.0% idle
% Mem: 35M Active, 23M Inact, 67M Wired, 24K Cache, 61M Buf, 876M Free
% Swap:
%
% PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
% 1325 root 1 120 0 856K 572K RUN 2:18 28.52% sh
% 1326 root 1 120 2 856K 572K RUN 1:39 19.97% sh
% 1327 root 1 120 4 856K 572K RUN 1:10 13.96% sh
% 1328 root 1 120 6 856K 572K RUN 0:50 9.72% sh
% 1329 root 1 123 8 856K 572K RUN 0:36 7.18% sh
% 1330 root 1 123 10 856K 572K RUN 0:25 5.03% sh
% 1331 root 1 124 12 856K 572K RUN 0:18 2.93% sh
% 1332 root 1 124 14 856K 572K RUN 0:13 1.86% sh
% 1333 root 1 124 16 856K 572K RUN 0:09 0.98% sh
% 1334 root 1 124 18 856K 572K RUN 0:06 1.07% sh
% 1335 root 1 123 20 856K 572K RUN 0:05 0.15% sh
The dynamic range here is 5:138 (about 1:28). Not as close to the
table's 1:32 as I would like.
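For reference, a 1:32 range over nice 0..20 corresponds to a weight
table that halves every 4 niceness steps (2^5 = 32). The following is
only my sketch of that arithmetic, not the actual table:

#include <math.h>
#include <stdio.h>

int
main(void)
{
	double w[11], total = 0.0;
	int i;

	/* Even nicenesses 0, 2, ..., 20, as in the test above. */
	for (i = 0; i <= 10; i++) {
		w[i] = pow(2.0, -(2.0 * i) / 4.0);	/* halve every 4 */
		total += w[i];
	}
	for (i = 0; i <= 10; i++)
		printf("nice %2d: weight %.4f -> %5.2f%% of CPU\n",
		    2 * i, w[i], 100.0 * w[i] / total);
	return (0);
}

Such a table predicts roughly 30% of the CPU for the nice 0 process,
close to the 28.52% observed above, but rather more than the observed
0.15% for the nice 20 process.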
Bruce