Permit init(8) use its own cpuset group.
Alexander V. Chernikov
melifaro at FreeBSD.org
Thu Jun 5 09:15:31 UTC 2014
On 05.06.2014 04:31, Konstantin Belousov wrote:
> On Wed, Jun 04, 2014 at 11:16:59PM +0400, Alexander V. Chernikov wrote:
>> On 04.06.2014 19:06, John Baldwin wrote:
>>> On Monday, June 02, 2014 12:48:50 pm Konstantin Belousov wrote:
>>>> On Mon, Jun 02, 2014 at 06:52:10PM +0400, Alexander V. Chernikov wrote:
>>>>> Hello list!
>>>>>
>>>>> Currently init(8) uses group 1 which is root group.
>>>>> Modifications of this group affects both kernel and userland threads.
>>>>> Additionally, such modifications are impossible, for example, in presence
>>>>> of multi-queue NIC drivers (like igb or ixgbe) which binds their threads
>>> to
>>>>> particular cpus.
>>>>>
>>>>> Proposed change ("init_cpuset" loader tunable) permits changing cpu
>>>>> masks for
>>>>> userland more easily. Restricting user processes to migrate to/from CPU
>>>>> cores
>>>>> used for network traffic processing is one of the cases.
>>>>>
>>>>> Phabricator: https://phabric.freebsd.org/D141 (the same version attached
>>>>> inline)
>>>>>
>>>>> If there are no objections, I'll commit this next week.
>>>> Why is the tunable needed ?
>>> Because some people already depend on doing 'cpuset -l 0 -s 1'. It is also
>>> documented in our manpages that processes start in cpuset 1 by default so
>>> that you can use 'cpuset -l 0 -s 1' to move all processes, etc.
>>>
>>> For the stated problem (bound ithreads in drivers), I would actually like to
>>> fix ithreads that are bound to a specific CPU to create a different cpuset
>>> instead so they don't conflict with set 1.
>> Yes, this seem to be much better approach.
>> Please take a look on the new patch (here or in the phabricator).
>> Comments:
>>
>> Use different approach for modifyable root set:
>>
>> * Make sets in 0..15 internal
>> * Define CPUSET_SET1 & CPUSET_ITHREAD in that range
>> * Add cpuset_lookup_builtin() to retrieve such cpu sets by id
>> * Create additional root set for ithreads
>> * Use this set in ithread_create()
>> * Change intr_setaffinity() to use cpuset_iroot (do we really need this)?
>>
>> We can probably do the same for kprocs, but I'm unsure if we really need it.
> I think this is fine. See below for the style comments.
Patch updated.
>
> With the proliferation of the special cpuset ids, it is probably time
> to add an assert to cpuset_rel() and cpuset_rel_defer() that it never
> try to free the preallocated set.
Added.
It seems that cpuset_rel() / _cpuset_create() pair is racy:
Let's imagine we have set X with single reference and the following 2
processes are going to happen:
1) cpuset_rel() is called (for example, while changing process CPU set
from userland)
2) given process is going to fork creating shadow set
Then, the folowing can happen:
1 calls refcount_release returning 0
2 calls cpuset_shadow() -> _cpuset_create() which finishes successfully
adding another reference to set X and returning to caller
1 proceeds for set deletion removing it from list, releasing index and
freeing it via uma_zfree()
I'm not sure how we can fix this easily
(
well, we can read current refcount value directly inside cpuset_lock,
return if != 0 which is more or less safe,
but more complex cases like multiple cpuset_rel() running on the same
time (and calling refcount_release())
are still possible
)
>
> Since you use static pre-known set id for ithreads, it should be documented
> along with the set 1 in the man page ?
Yes, I'll do that.
>
>>
>> Index: sys/kern/kern_cpuset.c
>> ===================================================================
>> --- sys/kern/kern_cpuset.c
>> +++ sys/kern/kern_cpuset.c
>> @@ -112,7 +112,7 @@
>> SYSCTL_INT(_kern_sched, OID_AUTO, cpusetsize, CTLFLAG_RD,
>> 0, sizeof(cpuset_t), "sizeof(cpuset_t)");
>>
>> -cpuset_t *cpuset_root;
>> +cpuset_t *cpuset_root, *cpuset_iroot;
>>
>> /*
>> * Acquire a reference to a cpuset, all pointers must be tracked with refs.
>> @@ -213,7 +213,7 @@
>> * Find a set based on an id. Returns it with a ref.
>> */
>> static struct cpuset *
>> -cpuset_lookup(cpusetid_t setid, struct thread *td)
>> +cpuset_lookup_id(cpusetid_t setid)
>> {
>> struct cpuset *set;
>>
>> @@ -227,6 +227,17 @@
>> cpuset_ref(set);
>> mtx_unlock_spin(&cpuset_lock);
>>
>> + return (set);
>> +}
>> +
>> +static struct cpuset *
>> +cpuset_lookup(cpusetid_t setid, struct thread *td)
>> +{
>> + struct cpuset *set;
>> +
>> + set = cpuset_lookup_id(setid);
>> +
>> +
> Too many emtpy lines. The statement above should probably go after the
> KASSERT() line below.
>
>> KASSERT(td != NULL, ("[%s:%d] td is NULL", __func__, __LINE__));
>> if (set != NULL && jailed(td->td_ucred)) {
>> struct cpuset *jset, *tset;
>> @@ -245,6 +256,25 @@
>> }
>>
>> /*
>> + * Find a set based on an id. Returns it with a ref.
>> + */
>> +struct cpuset *
>> +cpuset_lookup_builtin(cpusetid_t setid)
>> +{
>> + struct cpuset *set;
>> +
>> + KASSERT(setid <= CPUSET_RESERVED_MAX,
>> + ("[%s:%d] wrong set id: %d", __func__, __LINE__, setid));
>> +
>> + set = cpuset_lookup_id(setid);
>> +
>> + KASSERT(set != NULL,
>> + ("[%s:%d] set id %d not found", __func__, __LINE__, setid));
>> +
>> + return (set);
> Again the empty lines are not needed.
>
>> +}
>> +
>> +/*
>> * Create a set in the space provided in 'set' with the provided parameters.
>> * The set is returned with a single ref. May return EDEADLK if the set
>> * will have no valid cpu based on restrictions from the parent.
>> @@ -725,9 +755,10 @@
>> * 1 - The default set which all processes are a member of until changed.
>> * This allows an administrator to move all threads off of given cpus to
>> * dedicate them to high priority tasks or save power etc.
>> + * 2 - The root set which should be used for all interruot/ithreads.
> Typo in the 'interrupt'.
>
>> */
>> -struct cpuset *
>> -cpuset_thread0(void)
>> +static void
>> +cpuset_init_sets(void)
>> {
>> struct cpuset *set;
>> int error;
>> @@ -751,14 +782,31 @@
>> * Now derive a default, modifiable set from that to give out.
>> */
>> set = uma_zalloc(cpuset_zone, M_WAITOK);
>> - error = _cpuset_create(set, cpuset_zero, &cpuset_zero->cs_mask, 1);
>> + error = _cpuset_create(set, cpuset_zero, &cpuset_zero->cs_mask,
>> + CPUSET_SET1);
>> KASSERT(error == 0, ("Error creating default set: %d\n", error));
>> /*
>> - * Initialize the unit allocator. 0 and 1 are allocated above.
>> + * Create another root set for interrupt bindings.
>> */
>> - cpuset_unr = new_unrhdr(2, INT_MAX, NULL);
>> + set = uma_zalloc(cpuset_zone, M_WAITOK);
>> + error = _cpuset_create(set, cpuset_zero, &cpuset_zero->cs_mask,
>> + CPUSET_ITHREAD);
>> + KASSERT(error == 0, ("Error creating ithread cpuset: %d\n", error));
>> + set->cs_flags |= CPU_SET_ROOT;
>> + cpuset_iroot = &set->cs_mask;
>> + /*
>> + * Initialize the unit allocator. Reserve 0..15 for special sets.
>> + */
>> + cpuset_unr = new_unrhdr(CPUSET_RESERVED_MAX + 1, INT_MAX, NULL);
> Style requests that multi-line block comments have empty line before.
> Spelling CPUSET_RESERVED_MAX as 15 in the comment looks strange.
>
>> +}
>>
>> - return (set);
>> +struct cpuset *
>> +cpuset_thread0(void)
>> +{
>> +
>> + cpuset_init_sets();
>> +
> Empty line.
>
>> + return (cpuset_lookup_builtin(CPUSET_SET1));
>> }
>>
>> /*
>> Index: sys/kern/kern_intr.c
>> ===================================================================
>> --- sys/kern/kern_intr.c
>> +++ sys/kern/kern_intr.c
>> @@ -318,7 +318,7 @@
>> if (ie->ie_thread != NULL) {
>> CPU_ZERO(&mask);
>> if (cpu == NOCPU)
>> - CPU_COPY(cpuset_root, &mask);
>> + CPU_COPY(cpuset_iroot, &mask);
>> else
>> CPU_SET(cpu, &mask);
>> id = ie->ie_thread->it_thread->td_tid;
>> @@ -334,7 +334,7 @@
>> if (ie->ie_thread != NULL) {
>> CPU_ZERO(&mask);
>> if (ie->ie_cpu == NOCPU)
>> - CPU_COPY(cpuset_root, &mask);
>> + CPU_COPY(cpuset_iroot, &mask);
>> else
>> CPU_SET(ie->ie_cpu, &mask);
>> id = ie->ie_thread->it_thread->td_tid;
>> @@ -381,7 +381,7 @@
>> * If we're setting all cpus we can unbind. Otherwise make sure
>> * only one cpu is in the set.
>> */
>> - if (CPU_CMP(cpuset_root, mask)) {
>> + if (CPU_CMP(cpuset_iroot, mask)) {
>> for (n = 0; n < CPU_SETSIZE; n++) {
>> if (!CPU_ISSET(n, mask))
>> continue;
>> @@ -409,7 +409,7 @@
>> CPU_ZERO(mask);
>> mtx_lock(&ie->ie_lock);
>> if (ie->ie_cpu == NOCPU)
>> - CPU_COPY(cpuset_root, mask);
>> + CPU_COPY(cpuset_iroot, mask);
>> else
>> CPU_SET(ie->ie_cpu, mask);
>> mtx_unlock(&ie->ie_lock);
>> @@ -461,6 +461,7 @@
>> TD_SET_IWAIT(td);
>> thread_unlock(td);
>> td->td_pflags |= TDP_ITHREAD;
>> + td->td_cpuset = cpuset_lookup_builtin(CPUSET_ITHREAD);
>> ithd->it_thread = td;
>> CTR2(KTR_INTR, "%s: created %s", __func__, name);
>> return (ithd);
>> @@ -485,6 +486,7 @@
>> TD_SET_IWAIT(td);
>> thread_unlock(td);
>> td->td_pflags |= TDP_ITHREAD;
>> + td->td_cpuset = cpuset_lookup_builtin(CPUSET_ITHREAD);
>> ithd->it_thread = td;
>> CTR2(KTR_INTR, "%s: created %s", __func__, name);
>> return (ithd);
>> Index: sys/sys/cpuset.h
>> ===================================================================
>> --- sys/sys/cpuset.h
>> +++ sys/sys/cpuset.h
>> @@ -81,6 +81,10 @@
>> */
>> #define CPUSET_INVALID -1
>> #define CPUSET_DEFAULT 0
>> +#define CPUSET_SET1 1
>> +#define CPUSET_ITHREAD 2
>> +
>> +#define CPUSET_RESERVED_MAX 15
>>
>> #ifdef _KERNEL
>> LIST_HEAD(setlist, cpuset);
>> @@ -110,11 +114,12 @@
>> #define CPU_SET_ROOT 0x0001 /* Set is a root set. */
>> #define CPU_SET_RDONLY 0x0002 /* No modification allowed. */
>>
>> -extern cpuset_t *cpuset_root;
>> +extern cpuset_t *cpuset_root, *cpuset_iroot;
>> struct prison;
>> struct proc;
>>
>> struct cpuset *cpuset_thread0(void);
>> +struct cpuset *cpuset_lookup_builtin(cpusetid_t setid);
>> struct cpuset *cpuset_ref(struct cpuset *);
>> void cpuset_rel(struct cpuset *);
>> int cpuset_setthread(lwpid_t id, cpuset_t *);
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D141_4.diff
Type: text/x-patch
Size: 7079 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-hackers/attachments/20140605/de235f9b/attachment.bin>
More information about the freebsd-hackers
mailing list