Realtime thread priorities

Tue Dec 14 05:33:46 UTC 2010

Sergey Babkin wrote:
> John Baldwin wrote:
>> On Sunday, December 12, 2010 3:06:20 pm Sergey Babkin wrote:
>>> John Baldwin wrote:
>>>> The current layout breaks up the global thread priority space (0 - 255)
>>>> into a couple of bands:
>>>>
>>>>   0 -  63 : interrupt threads
>>>>  64 - 127 : kernel sleep priorities (PSOCK, etc.)
>>>> 128 - 159 : real-time user threads (rtprio)
>>>> 160 - 223 : time-sharing user threads
>>>> 224 - 255 : idle threads (idprio and kernel idle procs)
>>>>
>>>> If we decide to change the behavior I see two possible fixes:
>>>>
>>>> 1) (easy) just move the real-time priority range above the kernel sleep
>>>> priority range
>>> Would not this cause a priority inversion when an RT process
>>> enters the kernel mode?
>> How so?  Note that timesharing threads are not "bumped" to a kernel sleep
>> priority when they enter the kernel either.  The kernel sleep priorities are
>> purely a way for certain sleep channels to cause a thread to be treated as
>> interactive and give it a priority boost to favor interactive threads.
>> Threads in the kernel do not automatically have higher priority than threads
>> not in the kernel.  Keep in mind that all stopped threads (threads not
>> executing) are always in the kernel when they stop.
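
For reference, the band layout quoted above can be written down as a small
classifier. This is only a sketch: the enum, the function names and the
helper are invented for illustration and are not kernel identifiers. Only
the numeric ranges come from the quoted table.

```c
/* Band boundaries copied from the quoted layout. All names here are
 * hypothetical and exist only for this sketch. */
enum prio_band {
    BAND_ITHREAD,     /*   0 -  63 : interrupt threads */
    BAND_KERN_SLEEP,  /*  64 - 127 : kernel sleep priorities */
    BAND_REALTIME,    /* 128 - 159 : real-time user threads */
    BAND_TIMESHARE,   /* 160 - 223 : time-sharing user threads */
    BAND_IDLE,        /* 224 - 255 : idle threads */
    BAND_INVALID      /* outside 0 - 255 */
};

static enum prio_band
band_of(int pri)
{
    if (pri < 0 || pri > 255)
        return BAND_INVALID;
    if (pri <= 63)
        return BAND_ITHREAD;
    if (pri <= 127)
        return BAND_KERN_SLEEP;
    if (pri <= 159)
        return BAND_REALTIME;
    if (pri <= 223)
        return BAND_TIMESHARE;
    return BAND_IDLE;
}

/* A numerically lower priority value is the more urgent one. */
static int
more_urgent(int a, int b)
{
    return a < b;
}
```

With these ranges, more_urgent(127, 128) is true: any thread sitting at a
kernel sleep priority outranks every rtprio thread, which is the behavior
under discussion.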
> 
> I may be a bit behind the times here. But historically the "default"
> process priority means the priority when the process was pre-empted.
> If it did a system call, the priority on wake up would be as
> specified in the sleep() kernel function (or its more modern
> analog, like a sleeplock or condition variable). This would 
> let the kernel code react quickly, and then on return from 
> the syscall revert to the original priority, and possibly 
> get pre-empted by another process at that time. 

Agreed. When a thread is woken up, it means the kernel has some event
that the thread needs to process.

> 
> If the user-mode priority is higher than the kernel-mode priority,
> this would mean that once a high priority process does a system
> call (say for example, poll()), it would experience a priority
> inversion and sleep with a lower priority than specified.
> 
> A fix for this should be fairly straightforward. The process structure
> has the RT priority in it, so all that sleep() needs to do is check
> it and use that priority if it's higher than the one given
> as an argument. Or optionally, for the RT processes bump the argument
> by how much the process's RT priority is over the "RT baseline".
> (Well, "logically over", numerically under).
> 
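To make the two variants suggested above concrete, here is a rough sketch.
It assumes the real-time range has already been moved numerically below
the kernel sleep range (fix 1). The struct, its fields, the function names
and the rt_baseline parameter (taken here to be the numerically largest,
i.e. least urgent, RT priority) are all hypothetical, not the actual
kernel interfaces.

```c
/* Hypothetical stand-in for the per-thread scheduling state. */
struct thread_sketch {
    int is_rt;   /* nonzero if the thread is in the real-time class */
    int rt_pri;  /* its real-time priority; lower value = more urgent */
};

/* Variant 1: sleep at the more urgent of the thread's RT priority and
 * the priority passed to sleep(). */
static int
effective_sleep_pri(const struct thread_sketch *td, int sleep_pri)
{
    if (td->is_rt && td->rt_pri < sleep_pri)
        return td->rt_pri;
    return sleep_pri;
}

/* Variant 2: raise the requested sleep priority by however far the
 * thread's RT priority sits "logically over" (numerically under) the
 * RT baseline. */
static int
bumped_sleep_pri(const struct thread_sketch *td, int sleep_pri,
    int rt_baseline)
{
    int boost, pri;

    if (!td->is_rt)
        return sleep_pri;
    boost = rt_baseline - td->rt_pri;
    if (boost < 0)
        boost = 0;
    pri = sleep_pri - boost;
    return pri < 0 ? 0 : pri;   /* clamp at the most urgent value */
}
```

Either way a time-sharing thread keeps the priority given to sleep(), and
an RT thread never sleeps at a priority less urgent than that argument.
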
> This would not solve the more general classic priority inversion
> issue with some low-priority process grabbing some kernel 
> resource and sleeping at a lower priority while an RT process
> waits for this resource to be freed. I think the original
> idea of in-kernel processes having the higher priorities is in
> part an attempt to answer this problem. But I agree with you that
> letting the RT processes have a higher priority than the TS processes
> in the kernel mode is better than nothing.
> 
Is there a way to indicate that the current thread is in a critical
section and should not be preempted until it blocks? Once it is resumed,
it would still run at a higher priority than RT.

> Um, a stupid question: does the signal() primitive on mutexes/condvars
> (i.e. "wake up one sleeper") pick the thread with the highest 
> priority? I guess it should, since otherwise the classic 
> priority inversion can get pretty bad. But you can tell from
> what I say how long it has been since I looked at that code, so I
> don't know the answer.
> 
The mutex/condvar code does not do queue migration or deferred wakeups.
Once a thread is woken up and scheduled to run, it immediately spins on
the mutex. That just wastes time, because the owner may not have
released the mutex yet, and if the mutex owner is preempted, things
get worse.
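
As for what "wake up one sleeper" picking the highest-priority thread
could look like, below is a minimal sketch of that selection policy. It
is not taken from the kernel sources; struct waiter and pick_waiter() are
made up for illustration, and the list is assumed to be kept in arrival
order.

```c
#include <stddef.h>

/* Hypothetical wait-queue entry, linked in arrival (FIFO) order. */
struct waiter {
    int pri;              /* lower value = more urgent */
    struct waiter *next;
};

/* Return the most urgent waiter, or NULL if the queue is empty.
 * The strict comparison keeps the earliest entry among equal
 * priorities, so ties go to the thread that has waited longest. */
static struct waiter *
pick_waiter(struct waiter *head)
{
    struct waiter *best = NULL;
    struct waiter *w;

    for (w = head; w != NULL; w = w->next) {
        if (best == NULL || w->pri < best->pri)
            best = w;
    }
    return best;
}
```

Picking the right thread to wake does not by itself avoid the wasted
spinning described above if the mutex owner has not released the lock yet.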

> -SB