cpuset and affinity implementation

Jeff Roberson jroberson at chesapeake.net
Tue Feb 26 07:37:38 UTC 2008


On Tue, 26 Feb 2008, Daniel Eischen wrote:

> On Mon, 25 Feb 2008, Jeff Roberson wrote:
>
>> On Mon, 25 Feb 2008, Daniel Eischen wrote:
>> 
>>> On Mon, 25 Feb 2008, Jeff Roberson wrote:
>>> 
>>>> On Mon, 25 Feb 2008, Daniel Eischen wrote:
>>>> 
>>>>> On Mon, 25 Feb 2008, Jeff Roberson wrote:
>>>>> 
>>>>>> On Mon, 25 Feb 2008, Alfred Perlstein wrote:
>>>>>> 
>>>>>>> Jeff, this is very cool.  I do have one issue though:
>>>>>>> 
>>>>>>> + * A thread may not be assigned to a group separate from other threads in
>>>>>>> + * the process.  This is to remove ambiguity when the setid is queried with
>>>>>>> + * a pid argument.  There is no other technical limitation.
>>>>>>> 
>>>>>>> Am I understanding things correctly such that within a process there
>>>>>>> can only be one "set"?
>>>>>>> 
>>>>>>> If so this restricts some of the benefits you get with sets and
>>>>>>> binding.
>>>>>>> 
>>>>>>> An example would be some sort of system with multiple CPUs where some
>>>>>>> are assigned specifically for pseudo-realtime processing and others
>>>>>>> are for general control things such as cli, stats, etc.
>>>>>>> 
>>>>>>> In our case we would like to be able to run some threads on specific
>>>>>>> cpu sets, and other threads to be run anywhere on the control CPUs.
>>>>>>> 
>>>>>>> Can this be done with this API?
>>>>>> 
>>>>>> Individual threads can be bound to any cpu or group of cpus within the 
>>>>>> set. So if you just make a set that includes all cpus in the system you 
>>>>>> can then bind your realtime threads to specific cpus and the other 
>>>>>> threads to the remainder.  You will have to specifically bind each 
>>>>>> thread however.
>>>>>> 
>>>>>> The reason individual threads can't be assigned to groups is because 
>>>>>> cpuset_getid() for a pid wouldn't make sense then and I expect 
>>>>>> administrators to be mostly interested in managing groups of processes.
>>>>> 
>>>>> If the administrator sets up a set of CPUs specifically for
>>>>> real-time and another set for non-real-time, you may want to
>>>>> bind some threads to the real-time set, and leave other threads
>>>>> unbound (or even bound to the non-real-time set).  In this
>>>>> case, I think cpuset_getid() should either return the default
>>>>> cpuset of all cpus in the system, or the last cpuset to
>>>>> which the process was bound.
>>>>> 
>>>>> But regardless, I think binding a thread to a different
>>>>> processor set should be allowed and should override its
>>>>> inherent binding of the process' processor set.
>>>>> Hmm, I guess in this case, a subsequent binding of the
>>>>> process to a processor set should probably override any
>>>>> per-thread bindings.
>>>> 
>>>> I think we're getting into complex corner cases here which will only 
>>>> confuse the api and administrators.  I don't expect administrators will 
>>>> want to set groups to individual threads.  How would he even identify the 
>>>> individual thread?  And if he did, he could just as easily set masks on 
>>>> that thread along with others in the process.
>>>> 
>>>> I'm already a little nervous about how complicated this will be for 
>>>> programmers.   If we allowed each thread in a pid to be in its own set, 
>>>> we'd have to make cpuset_getid() return an array of ids.  I definitely 
>>>> don't want to do that.
>>> 
>>> Solaris does seem to allow this BTW.
>> 
>> Solaris also doesn't allow a processor to be in more than one set.  It 
>> doesn't allow a thread to bind to a processor that's in a processor set. 
>> It also doesn't seem to provide a mechanism to query the set that a thread is 
>> in, so there is no ambiguity for the querying.  However, when you modify 
>> you have the option of retrieving the old set.  They must simply return the 
>> first one discovered.  We could do that but it doesn't seem very 
>> attractive.
>
> Probably, we should just return the last processor set that
> was bound to the process (using the default processor set if
> there was none).  I would disregard any LWP/thread-specific
> bindings when returning the processor set for the process.
> Everyone should know by now that there are threads to
> consider, and if they want more specific information they
> can query the processor bindings for each thread as well as
> for the process.

Binding a processor set to the process simply sets the per-thread binding 
of each thread in the process.  There is otherwise no specific process 
binding.  We could keep a pointer to the last specifically bound set in 
the process if we wanted, but what would it be used for other than 
querying the id of the process?  What if each thread was separately 
specifically bound to a different set?  What set should be used on fork? 
The set of the process or the thread that called fork?  What about when 
creating a new thread?
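
To make the point concrete, here is a minimal sketch of that model in portable C.  It is not the kernel code: every toy_* name, the 64-bit mask type, and the TOY_SETID_MIXED value are made up for illustration.  It only shows that a process "binding" is a loop over the threads, and that once threads differ there is no single id left to report for the pid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_THREADS 8
#define TOY_SETID_MIXED (-1)    /* hypothetical "no single answer" value */

typedef uint64_t toy_mask_t;    /* bit n set: CPU n is allowed */

struct toy_thread {
	int        setid;       /* set this thread belongs to */
	toy_mask_t mask;        /* CPUs this thread may run on */
};

struct toy_proc {
	size_t            nthreads;
	struct toy_thread threads[TOY_MAX_THREADS];
	/* Deliberately no process-wide setid or mask. */
};

/* Binding the process just rewrites the binding of every thread in it. */
static void
toy_proc_bind(struct toy_proc *p, int setid, toy_mask_t mask)
{
	assert(p->nthreads <= TOY_MAX_THREADS);
	for (size_t i = 0; i < p->nthreads; i++) {
		p->threads[i].setid = setid;
		p->threads[i].mask = mask;
	}
}

/*
 * Querying by pid only has one answer while all threads agree.
 * An empty process is also reported as TOY_SETID_MIXED.
 */
static int
toy_proc_getid(const struct toy_proc *p)
{
	if (p->nthreads == 0)
		return (TOY_SETID_MIXED);
	for (size_t i = 1; i < p->nthreads; i++)
		if (p->threads[i].setid != p->threads[0].setid)
			return (TOY_SETID_MIXED);
	return (p->threads[0].setid);
}
```

In this sketch a later toy_proc_bind() wipes out any per-thread differences, which is the override behaviour discussed in this thread.  The fork and thread-creation questions are left open here on purpose, since the model has no process-level set to inherit from.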

>
> The Solaris man page for pset_bind does say that it binds
> all LWPs of the process when the argument is the PID.  That
> seems to indicate that it will override any LWP-specific
> bindings.

Yes, same with the current cpuset design.

>
>> Would people be in favor of binding threads to sets if it meant getting the 
>> id from a pid was not always 100% accurate?  Even though a thread may 
>> already restrict its set?
>
> I think it would be accurate if we really returned the set
> for the process, disregarding thread-specific bindings.  As
> long as it is worded correctly, I don't think it would be
> wrong.  The only ambiguity might be if there was no explicit
> per-process binding, but there were thread-specific bindings.
> Even in this case though, if you returned the default cpuset,
> I think it would still be accurate.

See above discussion.  I'm not sure what you mean by 'default' cpuset 
here.

>
>> From the pset_assign man page:
>> 
>> "Processors with LWPs bound to them using processor_bind(2) cannot be 
>> assigned to a new processor set. If this is attempted, pset_assign() will 
>> fail and set errno to EBUSY."
>> 
>> My cpuset design seems to be a lot more flexible.
>
> I think it is because older Solaris had only specific
> processor bindings.  Newer versions of Solaris added processor
> sets.  I don't think we would want this restriction :-)

Yeah, they started with the simplest interface and then kept adding more 
complex and incompatible interfaces.  They now have pools, sets, and 
binding, none of which are compatible with each other.  In fact, if you 
enable pools it disables sets.  And specific binding precludes both.
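
As a rough illustration of the pset_assign restriction quoted above, here is a toy model in C.  It is based only on the man page text, not on the Solaris implementation; struct toy_cpu and toy_pset_assign() are hypothetical names, and the real call takes different arguments.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_cpu {
	int pset;           /* processor set the CPU currently belongs to */
	int bound_lwps;     /* LWPs bound directly to this CPU */
};

/*
 * Refuse to move a CPU into another set while any LWP is bound to it,
 * mirroring the EBUSY failure described in the man page excerpt.
 */
static int
toy_pset_assign(struct toy_cpu *cpu, int new_pset)
{
	assert(cpu != NULL);
	if (cpu->bound_lwps > 0) {
		errno = EBUSY;
		return (-1);
	}
	cpu->pset = new_pset;
	return (0);
}
```

The point of the sketch is that direct binding and set membership block each other in this model, which is the kind of interaction the cpuset design tries to avoid.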

I also looked at the Linux implementation.  It uses a filesystem to store 
and manipulate set information.  It also seems to allow arbitrary binding 
and sets as we have; however, the distributed version is said not to allow 
modifying the set while live.  They call this migration.  I think the 
filesystem interface is inelegant, but it's similar in features to 
our cpuset.
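
For reference, the "arbitrary binding and sets" combination amounts to a two-level check: a thread may take any non-empty mask that stays inside its set.  A minimal sketch in portable C follows; toy_thread_setmask() and the 64-bit mask representation are assumptions for illustration, not the actual cpuset interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Accept a per-thread mask only if it is non-empty and a subset of the
 * set's mask.  Returns 0 and stores the mask, or -1 and leaves
 * *thread_mask untouched.
 */
static int
toy_thread_setmask(uint64_t set_mask, uint64_t requested,
    uint64_t *thread_mask)
{
	assert(thread_mask != NULL);
	if (requested == 0 || (requested & ~set_mask) != 0)
		return (-1);
	*thread_mask = requested;
	return (0);
}
```

With a set covering CPUs 0-7 (mask 0xFF), the realtime threads could take 0x03 and the control threads 0xFC, each bound individually, which covers the realtime/control split raised earlier in the thread.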

Jeff

>
> -- 
> DE
>


More information about the freebsd-arch mailing list