PERFORCE change 124529 for review

Ken Smith kensmith at cse.Buffalo.EDU
Mon Aug 27 12:22:40 PDT 2007


On Mon, 2007-08-27 at 14:16 -0400, Jung-uk Kim wrote:
> sched_{get,set}affinity() are very misleading syscalls.  Userland 
> (glibc) and kernel have different definitions and Roman's patch 
> implemented linux 2.6 kernel behaviour, AFAIK.  Glibc wraps all 
> differences between kernel versions.  See the following link:
> 
> http://jeff.squyres.com/journal/archives/2005/10/linux_processor.html

Wow, "misleading" may be the understatement of the year...  :-(

My only interest is in not committing something that would confuse a
user-level Linux binary running on FreeBSD through our compat layers.
To that end I'm going to be a bit of a jerk on this
and I apologize for that.  As far as I can tell you are exactly right
that this syscall in Linux is extremely confusing.  What I ask is for
someone to point me at something that suggests the behaviour the
submitted patch implements was in use in at least *some* Linux kernel,
preferably one that saw semi-widespread use.  :-)

I did follow up on this a bit myself by downloading what appears to be
the bleeding edge of the Linux kernel, which is probably not the right
thing to do.  The sched_getaffinity() implementation in that kernel
(linux-2.6.22) is this:

long sched_getaffinity(pid_t pid, cpumask_t *mask)
{
        struct task_struct *p;
        int retval;

        mutex_lock(&sched_hotcpu_mutex);
        read_lock(&tasklist_lock);

        retval = -ESRCH;
        p = find_process_by_pid(pid);
        if (!p)
                goto out_unlock;

        retval = security_task_getscheduler(p);
        if (retval)
                goto out_unlock;

        cpus_and(*mask, p->cpus_allowed, cpu_online_map);

out_unlock:
        read_unlock(&tasklist_lock);
        mutex_unlock(&sched_hotcpu_mutex);
        if (retval)
                return retval;

        return 0;
}

The security_task_getscheduler() call seems to be a no-op at the moment
(a work in progress - as far as I can tell there is only one
task_getscheduler() function implemented at the moment and it always
returns 0).

As the reference you provided said, there does seem to be the
possibility of "interference" from the glibc code, but as far as I can
tell from what I have access to, none of the various options would wind
up returning anything other than zero in the case of success, and that
is what has me worried.  If anyone can point me at something that shows
a case where the size of the mask really winds up being the return
value upon success, I'm totally willing to approve this.
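
One way to settle the question empirically is to compare, on a real
Linux box, what the raw system call returns with what the glibc wrapper
returns.  The function quoted above takes two arguments and is an
in-kernel helper, so what a binary actually observes depends on the
syscall entry point and on libc.  The following is only a minimal
sketch, assuming Linux with glibc; the two helper names are
hypothetical and made up for illustration, and no particular result is
claimed here:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: invoke the system call directly, bypassing the
 * glibc wrapper, and hand back whatever the kernel returned
 * (-1 with errno set on failure).
 */
long raw_getaffinity_ret(void)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);
    /* pid 0 means "the calling thread". */
    return syscall(SYS_sched_getaffinity, 0, sizeof(mask), &mask);
}

/*
 * Hypothetical helper: the same query through the glibc wrapper, which
 * may translate the kernel's return value before the caller sees it.
 */
int wrapped_getaffinity_ret(void)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);
    return sched_getaffinity(0, sizeof(mask), &mask);
}
```

Printing both values from a small main() on a few kernel and glibc
versions would show whether the size of the mask ever reaches userland
as the success value, and through which of the two paths.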

Sorry for the hassle.  Thanks.

-- 
                                                Ken Smith
- From there to here, from here to      |       kensmith at cse.buffalo.edu
  there, funny things are everywhere.   |
                      - Theodore Geisel |
