kern/129164: Wrong priority value for normal processes

Unga unga888 at
Fri Nov 28 04:18:41 PST 2008

--- On Fri, 11/28/08, Bruce Evans <brde at> wrote:

> From: Bruce Evans <brde at>
> Subject: Re: kern/129164: Wrong priority value for normal processes
> To: "David Xu" <davidxu at>
> Cc: "Jeff Roberson" <jroberson at>, "Unga" <unga888 at>, yanefbsd at, freebsd-bugs at, jeff at, brde at, jhb at
> Date: Friday, November 28, 2008, 6:48 PM
> On Fri, 28 Nov 2008, David Xu wrote:
> > This might be caused by following code in sched_ule.c:
> No, this is mostly user confusion -- see a previous reply.
> > static void
> > sched_priority(struct thread *td)
> > {
> >        int score;
> >        int pri;
> > 
> >        if (td->td_pri_class != PRI_TIMESHARE)
> >                return;
> >        /*
> >         * If the score is interactive we place the thread in the realtime
> >         * queue with a priority that is less than kernel and interrupt
> >         * priorities.  These threads are not subject to nice restrictions.
> >         *
> >         * Scores greater than this are placed on the normal timeshare queue
> >         * where the priority is partially decided by the most recent cpu
> >         * utilization and the rest is decided by nice value.
> >         *
> >         * The nice value of the process has a linear effect on the calculated
> >         * score.  Negative nice values make it easier for a thread to be
> >         * considered interactive.
> >         */
> >        score = imax(0, sched_interact_score(td) - td->td_proc->p_nice);
> >        if (score < sched_interact) {
> >                pri = PRI_MIN_REALTIME;
> >                pri += ((PRI_MAX_REALTIME - PRI_MIN_REALTIME) / sched_interact)
> >                    * score;
> >                KASSERT(pri >= PRI_MIN_REALTIME && pri <= PRI_MAX_REALTIME,
> >                    ("sched_priority: invalid interactive priority %d score %d",
> >                    pri, score));
> >        } else {
> > 
> > 
> > it uses PRI_MIN_REALTIME, then it calls sched_user_prio(td, pri) which
> > sets td_base_user_pri and td_user_pri, and causes td_user_pri and
> > td_base_user_pri to be out of range.
> They are out of normal range for a PRI_TIMESHARE user thread, but this is
> intentional -- the thread is supposed to temporarily act like a
> PRI_REALTIME one -- see the comment.
> No, they should be the PRI_REALTIME limits like they are.
> The user confusion is that the garbage returned by rtprio(RTP_LOOKUP,
> ...) for PRI_TIMESHARE processes is interpreted as a realtime priority.
> The garbage was originally 0, but is now supposed to be (for no good
> reason) the current base user priority.  In either case, it has very
> little to do with the realtime priority, so I call it garbage.  Its
> upper limit has always been out of bounds for a realtime priority, and
> the above code makes it go negative and thus its lower limit is out
> of bounds for a realtime priority too.  Since realtime priorities are
> unsigned, going below the lower limit just gives more obvious garbage
> by misrepresenting a negative value in a u_short.

Hi Bruce

rtprio(2) is implemented in /usr/src/sys/kern/kern_resource.c as rtprio_thread().

Looking at the rtprio(2) implementation, it is clear that its author intended it to set Realtime, Normal and Idletime priorities, and to read back the original priority value (i.e. the value that was set) of Realtime, Normal and Idletime processes. rtprio(2) is not intended to be limited to the Realtime and Idletime classes.

rtprio(2) sets the priority class (Realtime, Normal or Idletime) in td_pri_class and the priority value in td_base_user_pri of the "thread" structure defined in /usr/include/sys/proc.h. When rtprio(2) reads the priority, it reads the class and the value from td_pri_class and td_base_user_pri, respectively.

That is, rtprio(2) expects that the original class and value do not change during scheduling. This expectation is now broken.
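
As a rough illustration of how the changed value shows up to a caller, the sketch below reproduces the wrap-around Bruce describes. It assumes that the lookup reports the base user priority minus PRI_MIN_TIMESHARE in an unsigned short, and that PRI_MIN_REALTIME and PRI_MIN_TIMESHARE are 128 and 160. Both the constant values and the helper reported_prio() are assumptions made for this example, not code from the tree.

```c
/*
 * Assumed values, modelled on a 7.x-era <sys/priority.h>.
 * Check the header in your own tree before relying on them.
 */
#define PRI_MIN_REALTIME   128
#define PRI_MIN_TIMESHARE  160

/*
 * Hypothetical helper: the value a lookup would report for a Normal
 * (PRI_TIMESHARE) thread if it stores
 * (base user priority - PRI_MIN_TIMESHARE) in an unsigned short.
 * A base priority below PRI_MIN_TIMESHARE gives a negative difference,
 * which wraps around when converted to the unsigned type.
 */
static unsigned short
reported_prio(int base_user_pri)
{
	return (unsigned short)(base_user_pri - PRI_MIN_TIMESHARE);
}
```

With these assumed values and a 16-bit unsigned short, a Normal thread whose base user priority has been moved down to 128 would be reported as 65504 instead of a small number, while a base priority of 160 would be reported as 0.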

David Xu pointed out one way in which td_base_user_pri gets changed by sched_ule.
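
To make that path concrete, here is a minimal user-space sketch of the interactive branch of the quoted sched_priority() code. The constant values (128 and 159 for the realtime range, 30 for the interactivity threshold) are assumptions for illustration, and interactive_pri() is a hypothetical stand-in for the kernel code, not the kernel function itself.

```c
#include <assert.h>

/* Assumed values for illustration; check <sys/priority.h> and sched_ule.c. */
#define PRI_MIN_REALTIME  128
#define PRI_MAX_REALTIME  159
#define SCHED_INTERACT    30	/* assumed value of sched_interact */

/*
 * Hypothetical stand-in for the interactive branch quoted above.
 * Returns the priority chosen for an interactive score, or -1 when the
 * score is not interactive (the else branch is not shown in the quote).
 */
static int
interactive_pri(int score)
{
	int pri;

	if (score < 0 || score >= SCHED_INTERACT)
		return (-1);
	pri = PRI_MIN_REALTIME;
	pri += ((PRI_MAX_REALTIME - PRI_MIN_REALTIME) / SCHED_INTERACT) * score;
	/* Same range check as the KASSERT in the quoted code. */
	assert(pri >= PRI_MIN_REALTIME && pri <= PRI_MAX_REALTIME);
	return (pri);
}
```

With these assumed numbers the step is 31 / 30 = 1 in integer arithmetic, so interactive scores 0 to 29 map to priorities 128 to 157. Every one of them lies in the realtime range and below the start of the timeshare range, which is the value that then ends up in td_base_user_pri.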

In my understanding, the "thread" structure should carry the original priority class and value unchanged, so that any system call can reference them at any time.

The original priority value and the running priority value are two different things. The original priority value should be static, meaning it normally should not change, while the running priority value can vary from 255 (most idle) to 0 (highest priority).

The running priority may be useful for scheduler debugging, but beyond that it is transient: it lasts only a fraction of a second.

What matters most is knowing the original priority class and value. This is useful when a user wants to organize processes into different priority classes: some in the Realtime class, some at Normal priority, and some in the Idletime class. I'm one such user. One example is running JACK in Realtime, Firefox in Normal and Bittorrent in Idletime. Once processes have been assigned to priority classes, one needs to check that they are in the intended classes. That is why one needs to inspect the original priority class and value. At the moment you check, depending on the load, Firefox may even be running at a Realtime priority so that it gets executed. The next moment Firefox is back at its normal priority.

So the implementors of sched_ule should clarify, if td_base_user_pri is now dynamic, which field of the "thread" structure now carries the original priority value. Or, if they made a mistake by overlooking the expectation of rtprio(2), it would be best if the implementors of sched_ule could fix it. Of course, anybody else who understands sched_ule could look into it.

Please note that I'm not a native English speaker and have no intention of forcing developers to do anything; please forgive me if anybody feels that way. This is meant as a friendly explanation.

Kind regards

