Realtime thread priorities

John Baldwin jhb at
Fri Dec 10 15:52:10 UTC 2010

So I finally had a case today where I wanted to use rtprio, but it doesn't seem 
very useful in its current state.  Specifically, I want to be able to tag 
certain user processes as being more important than any other user processes, 
even to the point that if one of my important processes blocks on a mutex, the 
owner of that mutex should be more important than sshd being woken up from 
sbwait by new data (for example).  This doesn't work currently with rtprio due 
to the way the priorities are laid out (and I believe I probably argued for 
the current layout back when it was proposed).

The current layout breaks up the global thread priority space (0 - 255) into 
five bands:

  0 -  63 : interrupt threads
 64 - 127 : kernel sleep priorities (PSOCK, etc.)
128 - 159 : real-time user threads (rtprio)
160 - 223 : time-sharing user threads
224 - 255 : idle threads (idprio and kernel idle procs)
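
As a rough illustration, the bands can be written down as constants with a
small classifier.  The names below are modeled on those in FreeBSD's
<sys/priority.h> but are defined locally here, and pri_to_band() and
pri_beats() are hypothetical helpers, not kernel functions.  The sketch relies
on the convention that a numerically lower priority is more important.

```c
#include <assert.h>

/*
 * Band boundaries copied from the table above.  The names are modeled on
 * FreeBSD's <sys/priority.h> but are defined locally for illustration.
 * A numerically lower value means a more important thread.
 */
enum {
	PRI_MIN_ITHD      = 0,
	PRI_MIN_KERN      = 64,
	PRI_MIN_REALTIME  = 128,
	PRI_MIN_TIMESHARE = 160,
	PRI_MIN_IDLE      = 224,
	PRI_MAX           = 255
};

enum pri_band {
	BAND_ITHD, BAND_KERN, BAND_REALTIME, BAND_TIMESHARE, BAND_IDLE
};

/* Hypothetical helper: map a global priority (0 - 255) to its band. */
static enum pri_band
pri_to_band(int pri)
{
	assert(pri >= PRI_MIN_ITHD && pri <= PRI_MAX);
	if (pri < PRI_MIN_KERN)
		return (BAND_ITHD);
	if (pri < PRI_MIN_REALTIME)
		return (BAND_KERN);
	if (pri < PRI_MIN_TIMESHARE)
		return (BAND_REALTIME);
	if (pri < PRI_MIN_IDLE)
		return (BAND_TIMESHARE);
	return (BAND_IDLE);
}

/* Hypothetical helper: nonzero if priority 'a' runs in preference to 'b'. */
static int
pri_beats(int a, int b)
{
	return (a < b);
}
```

With this layout any priority in the kernel sleep band beats every priority
in the real-time band, since 64 - 127 sits numerically below 128 - 159.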

The problem I am running into is that when a time-sharing thread goes to sleep 
in the kernel (waiting on select, socket data, tty, etc.) it actually ends up 
in the kernel priorities range (64 - 127).  This means when it wakes up it 
will trump (and preempt) a real-time user thread even though the time-sharing 
thread nominally has a priority down in the 160 - 223 range.  We do drop the 
kernel sleep priority during userret(), but we don't recheck the scheduler 
queues to see if we should preempt the thread during userret(), so it 
effectively runs with the kernel sleep priority for the rest of the quantum 
while it is in userland.
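
The sequence described above can be modeled in a few lines of user-space C.
This is only a toy model under stated assumptions: struct thread, struct cpu
and the model_*() functions are hypothetical stand-ins, not the kernel's data
structures, and the "run queue" is reduced to a single waiting thread.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kernel structures; all names here are hypothetical. */
struct thread {
	int user_pri;	/* nominal priority in the thread's own band */
	int pri;	/* effective priority the scheduler compares */
};

struct cpu {
	struct thread *curthread;	/* thread currently running */
	struct thread *waiting;		/* one runnable thread, or NULL */
};

/* Sleeping in the kernel lifts the thread into the 64 - 127 band. */
static void
model_sleep(struct thread *td, int sleep_pri)
{
	td->pri = sleep_pri;
}

/* Wakeup: the woken thread preempts if its effective priority is better. */
static void
model_wakeup(struct cpu *c, struct thread *td)
{
	if (td->pri < c->curthread->pri) {
		c->waiting = c->curthread;
		c->curthread = td;
	} else
		c->waiting = td;
}

/* Return to user mode: drop the sleep priority, with no queue recheck. */
static void
model_userret(struct cpu *c)
{
	c->curthread->pri = c->curthread->user_pri;
}
```

For example, take a real-time thread at 140 that is running and a
time-sharing thread at 180 that slept at an assumed kernel sleep priority of
88.  model_wakeup() hands the CPU to the time-sharing thread, and after
model_userret() it is back at 180 but still running, while the thread at 140
sits in the queue.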

My first question is whether this behavior is the desired behavior.  Originally 
I think I preferred the current layout because I thought a thread in the kernel 
should always have priority so it can release locks, etc.  However, priority 
propagation should actually handle the case of some very important thread 
needing a lock.  In my use case today, where I actually want to use rtprio, I 
think I want different behavior, where the rtprio thread is more important than 
the thread waking up with PSOCK, etc.

If we decide to change the behavior I see two possible fixes:

1) (easy) just move the real-time priority range above the kernel sleep 
priority range

2) (harder) make sched_userret() check the run queue to see if it should 
preempt when dropping the kernel sleep priority.  I think bde@ has suggested 
that we should do this for correctness previously (and I've had some old, 
unfinished patches to do this in a branch in p4 for several years).
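
Both options can be sketched in the same kind of toy model.  Everything here
is an assumption made for illustration: the NEW_* band values are one
possible layout rather than a proposal, userret_with_recheck() is a
hypothetical function (not the actual sched_userret() or the old p4 patches),
and the run queue is again reduced to a single best runnable thread.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kernel structures; all names here are hypothetical. */
struct thread {
	int user_pri;	/* nominal priority in the thread's own band */
	int pri;	/* effective priority the scheduler compares */
};

struct cpu {
	struct thread *curthread;	/* thread currently running */
	struct thread *best_runnable;	/* most important waiter, or NULL */
};

/*
 * Option 1: one hypothetical layout with the real-time band placed
 * numerically below (more important than) the kernel sleep band.
 */
enum {
	NEW_PRI_MIN_REALTIME = 64,
	NEW_PRI_MIN_KERN     = 96
};

/* Nonzero if a woken thread would preempt the running one. */
static int
pri_preempts(int woken_pri, int running_pri)
{
	return (woken_pri < running_pri);
}

/*
 * Option 2: drop the kernel sleep priority and then recheck whether a
 * more important thread is runnable; if so, switch to it.
 */
static void
userret_with_recheck(struct cpu *c)
{
	struct thread *td = c->curthread;

	assert(td != NULL);
	td->pri = td->user_pri;
	if (c->best_runnable != NULL && c->best_runnable->pri < td->pri) {
		struct thread *next = c->best_runnable;

		c->best_runnable = td;	/* current thread goes back on the queue */
		c->curthread = next;	/* more important thread takes the CPU */
	}
}
```

Option 1 stops the preemption from happening at wakeup time, because no
kernel sleep priority can beat a real-time priority.  Option 2 still allows
the preemption while the woken thread is in the kernel, but gives the CPU
back as soon as that thread returns to user mode.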

John Baldwin

More information about the freebsd-arch mailing list