kern/129164: Wrong priority value for normal processes

Bruce Evans brde at optusnet.com.au
Tue Nov 25 07:09:18 PST 2008


On Tue, 25 Nov 2008, Bruce Evans wrote:

> On Tue, 25 Nov 2008, Garrett Cooper wrote:
>
>> On Nov 25, 2008, at 12:31 AM, Unga wrote:
>> > FreeBSD grey.lan 7.0-STABLE FreeBSD 7.0-STABLE Sun May 25 2008 i386
>> >> Description:
>> > The priority value for root and other normal processes is 65504
>> > (rtp.prio) where zero (0) is expected.
>
> This value is unexpected, but rtp.prio is meaningless for normal
                 actually, a garbage value is expected
> processes -- the priority (nice value) is given by getpriority(2), not
> by rtprio(2).  rtp.prio gives the realtime or idletime priority for
> realtime and idletime processes, respectively.  It is meaningless for
> normal processes, but rtprio(RTP_LOOKUP, ...) bogusly succeeds and is

>> > Maximum priority value for normal priority processes can take is 20,
>> > not 65504. Normal priority processes are expected to run at priority
>> > zero (0) as it is specified in /etc/login.conf under login class
>> > "default".
>
> For normal priorities, the range is [-20, 20].
>
> For rtp.prio, the range is [0, 31].

And for the dynamic priority of normal processes, after subtracting the
PRI_TIMESHARE bias of 160, it is supposed to be [0, 63].  It apparently
is this for SCHED_4BSD, but for SCHED_ULE it apparently goes down to
-32, with -32 the normal value for an almost-idle process.  This might
cause problems since the range of [-32, -1] is reserved for realtime
processes.

> 65504 is probably -32 misrepresented as an unsigned short.
>
> I can't see any problems with the range being subtracted -- there used to
> be problems when td_base_pri was used (since this value became wrong
> and possibly out of bounds with priority propagation), but these were
> fixed by using td_base_user_pri (in 6.x IIRC -- 7.0 has these fixes).

Now I see it.  I missed calls to sched_user_priority() in sched*.c.
These are necessary and now made in the PRI_TIMESHARE case only to
update td_base_user_pri to the current (dynamic) user priority so that
after priority propagation the previous user priority can be restored
(old broken code always restored PUSER).  So td_base_user_pri is now
very dynamic -- it gives close to the current user priority (which for
some reason isn't used directly for priority propagation).  Interpreting
it as a realtime priority gives garbage.  The normal (biased) garbage
value for an almost-idle process is 20 for SCHED_4BSD and -32 for
SCHED_ULE (run the test program a few times to see different values,
or put a loop in the program to see consistently higher values).  It
used to be 0.  -32 is out of bounds both for the rtprio range and
for an unsigned short, so the value is more obviously garbage for
SCHED_ULE.  0 looks normal, so the garbage used to be less obvious.

The only bug here is returning garbage for rtprio(RTP_LOOKUP, ...) in
non-rt/idprio cases, or perhaps in applications using the value in
such cases.  Programs like ps do avoid using the value, but have other
bugs.  I just noticed some of these:

From ps.1:

% .It Cm rtprio
% realtime priority (101 = not a realtime process)

ps was changed long ago to print "normal" instead of this magic number.

From print.c:

% void
% priorityr(KINFO *k, VARENT *ve)
% {
% 	VAR *v;
% 	struct priority *lpri;
% 	char str[8];
% 	unsigned class, level;
% 
% 	v = ve->var;
% 	lpri = &k->ki_p->ki_pri;
% 	class = lpri->pri_class;
% 	level = lpri->pri_level;

pri_level is a very wrong field to use here.  It gives the current
(dynamic) priority for all processes.  The rtprio fields are unfortunately
not available in userland.  They were lost on 2001/02/12.

% 	switch (class) {
% 	case PRI_ITHD:
% 		snprintf(str, sizeof(str), "intr:%u", level);
% 		break;

ithreads don't have an rtprio level.  Printing their current priority
instead is reasonable but is confusing since in all other places this
priority is printed after subtracting a bias of PZERO.

% 	case PRI_REALTIME:
% 		snprintf(str, sizeof(str), "real:%u", level);
% 		break;

Broken in the same way as the PRI_IDLE case (explained below).

% 	case PRI_TIMESHARE:
% 		strncpy(str, "normal", sizeof(str));
% 		break;

OK, since there is no rtp.prio to print.

% 	case PRI_IDLE:
% 		snprintf(str, sizeof(str), "idle:%u", level);
% 		break;

Kernel idle processes have a current priority of 255 (unless undergoing
priority propagation) and an rtp.prio of 31.  The above misprints this
as "idle:25", where 25 is 255 truncated (snprintf()'s return value is
of course ignored, so this error is not detected).

% 	default:
% 		snprintf(str, sizeof(str), "%u:%u", class, level);
% 		break;
% 	}
% 	str[sizeof(str) - 1] = '\0';
% 	(void)printf("%*s", v->width, str);
% }

Bruce