PERFORCE change 124529 for review

Jung-uk Kim jkim at FreeBSD.org
Mon Aug 27 14:24:13 PDT 2007


On Monday 27 August 2007 02:57 pm, Ken Smith wrote:
> On Mon, 2007-08-27 at 14:16 -0400, Jung-uk Kim wrote:
> > sched_{get,set}affinity() are very misleading syscalls.  Userland
> > (glibc) and kernel have different definitions and Roman's patch
> > implemented linux 2.6 kernel behaviour, AFAIK.  Glibc wraps all
> > differences between kernel versions.  See the following link:
> >
> > http://jeff.squyres.com/journal/archives/2005/10/linux_processor.html
>
> Wow, "misleading" may be the understatement of the year...  :-(
>
> My only interest is not committing something that a user-level
> Linux binary running on FreeBSD will be confused by if it were to
> go through our compat layers.  To that end I'm going to be a bit of
> a jerk on this and I apologize for that.  As far as I can tell you
> are exactly right that this syscall in Linux is extremely
> confusing.  What I ask is for someone to point me at something that
> suggests what the submitted patch implements was at least in use in
> *some* Linux kernel, preferably one that saw semi-widespread use.
> :-)
>
> I did follow up on this a bit myself by downloading what appears to
> be the bleeding edge of the Linux kernel which is probably not the
> right thing to do.  Its implementation in that kernel
> (linux-2.6.22) is this:
>
> long sched_getaffinity(pid_t pid, cpumask_t *mask)
> {
>         struct task_struct *p;
>         int retval;
>
>         mutex_lock(&sched_hotcpu_mutex);
>         read_lock(&tasklist_lock);
>
>         retval = -ESRCH;
>         p = find_process_by_pid(pid);
>         if (!p)
>                 goto out_unlock;
>
>         retval = security_task_getscheduler(p);
>         if (retval)
>                 goto out_unlock;
>
>         cpus_and(*mask, p->cpus_allowed, cpu_online_map);
>
> out_unlock:
>         read_unlock(&tasklist_lock);
>         mutex_unlock(&sched_hotcpu_mutex);
>         if (retval)
>                 return retval;
>
>         return 0;
> }
>
> The security_task_getscheduler() call seems to be a no-op at the
> moment (a work in progress - as far as I can tell there is only one
> task_getscheduler() function implemented at the moment and it
> always returns 0).
>
> As the reference you provided said, there does seem to be the
> possibility of "interference" from the glibc code, but as far as I
> can tell from what I have access to, none of the various options
> would wind up returning anything other than zero in the case of
> success, and that is what has me worried.  If anyone can point me at
> something that shows a case where the size of the mask really winds
> up being the return value upon success, I'm totally willing to
> approve this.
>
> Sorry for the hassle.  Thanks.

You missed the actual syscall entry point:

http://lxr.linux.no/source/kernel/sched.c#L4542

It actually returns sizeof(cpumask_t). ;-)
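
For anyone who wants to see the difference from userland, here is a
minimal sketch, assuming a Linux host with glibc.  The helper names
raw_getaffinity() and libc_getaffinity() are made up for illustration.
The first goes straight to the kernel through syscall(2), the second
goes through the glibc wrapper.  The exact byte count the raw call
returns depends on the kernel configuration, so only "positive" is
assumed here.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: call the kernel entry point directly, bypassing
 * the glibc wrapper.  pid 0 means the calling thread.  On success the
 * kernel reports how many bytes of the mask it copied out; on failure
 * syscall(2) returns -1 and sets errno.
 */
static long
raw_getaffinity(cpu_set_t *set)
{
	return (syscall(SYS_sched_getaffinity, 0, sizeof(*set), set));
}

/*
 * Hypothetical helper: the same query through the glibc wrapper, which
 * hides the byte count and returns 0 on success, -1 on failure.
 */
static int
libc_getaffinity(cpu_set_t *set)
{
	return (sched_getaffinity(0, sizeof(*set), set));
}
```

Calling both on the same process should give a positive value from
raw_getaffinity() and 0 from libc_getaffinity(), which is why a
binary that bypasses the wrapper, or the wrapper itself, cares what
our compat layer returns.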

Jung-uk Kim

