cpuset_setithread and PROC_LOCK vs. thread_lock usage: Is this correct?

Tue May 30 01:48:37 UTC 2017

[FYI: The source referenced is from head
-r317820.]

In the process of trying to get evidence
related to a periodic/random kernel panic
for TARGET_ARCH=powerpc I ran into what
follows. I've no evidence that it is tied
to the panics. It is just something that
I read that looked a little odd to me.

But it looked like a good thing to ask
about, at least based on my level of
ignorance of such things for the kernel.
(The more I learn the better.)

I normally see the likes of:

                thread_lock(td);
                tdset = td->td_cpuset;

and:

        thread_lock(td);
        error = cpuset_shadow(td->td_cpuset, nset, mask);

but in cpuset_setithread the comments read like
a PROC_LOCK is in use instead:

int
cpuset_setithread(lwpid_t id, int cpu)
{
. . .
        struct proc *p;
. . .
        error = cpuset_which(CPU_WHICH_TID, id, &p, &td, &old_set);
        if (error != 0 || ((cs_id = alloc_unr(cpuset_unr)) == CPUSET_INVALID))
                goto out;

        /* cpuset_which() returns with PROC_LOCK held. */
        old_set = td->td_cpuset;
. . .

That seems true for !error but is a different
lock than used elsewhere (those thread_lock's).

cpuset_which does seem to end up with the PROC_LOCK
as indicated:

int
cpuset_which(cpuwhich_t which, id_t id, struct proc **pp, struct thread **tdp,
    struct cpuset **setp)
{
. . .
        case CPU_WHICH_TID:
                if (id == -1) {
                        PROC_LOCK(curproc);
                        p = curproc;
                        td = curthread;
                        break;
                }
                td = tdfind(id, -1);
                if (td == NULL)
                        return (ESRCH);
                p = td->td_proc;
                break;
. . .
        error = p_cansched(curthread, p);
        if (error) {
                PROC_UNLOCK(p);
                return (error);
        }
        if (td == NULL)
                td = FIRST_THREAD_IN_PROC(p);
        *pp = p;
        *tdp = td;
        return (0);
}

(tdfind does rw_rlock on tidhash_lock,
PROC_LOCK on td->td_proc. It only
PROC_UNLOCK's for PRS_NEW and
is comments to return with the proc
lock held. No thread_lock's involved.)

So it appears to me that in cpuset_setithread :

        /* cpuset_which() returns with PROC_LOCK held. */
        old_set = td->td_cpuset;

might happen with no thread_lock protecting the
td use (say to avoid reading during updates). (This
might also apply to what old_set points to.)

Similarly for:

cpuset_setithread(lwpid_t id, int cpu)
{
. . .
        struct cpuset *parent, *old_set;
. . .
        error = cpuset_shadow(parent, nset, &mask);
. . .

it does not appear that a thread_lock is involved
around the cpuset_shadow operation, unlike
elsewhere.

Is this lack of thread_lock involvement in fact
okay for some reason in each case?

===
Mark Millard
markmi at dsl-only.net