Re: git: e2650af157bc - main - Make CPU_SET macros compliant with other implementations

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Sun, 02 Jan 2022 23:47:27 UTC
On Mon, Jan 03, 2022 at 12:35:39AM +0100, Stefan Esser wrote:
> Am 02.01.22 um 23:16 schrieb Konstantin Belousov:
> > On Sun, Jan 02, 2022 at 10:45:14PM +0100, Stefan Esser wrote:
> >> Am 02.01.22 um 20:51 schrieb Antoine Brodin:
> [...]
> >> Python 3.8.12 (default, Dec 31 2021, 10:50:47)
> >>>>> import os
> >>>>> os.sched_getaffinity(0)
> >> Traceback (most recent call last):
> >>   File "<stdin>", line 1, in <module>
> >> OSError: [Errno 34] Result too large
> >>
> >> This is a Python interpreter problem: it seems that the wrapper
> >> for the sched_getaffinity() function that has been introduced by
> >> kib in <sched.h> is buggy.
> >>
> >> As a work-around I have added a patch to comment out the
> >> os.sched_getaffinity(0) call (which used to cause an Attribute
> >> error that was caught by try/except, before).
> >>
> >> See ports commit 507c189b2876.
> > 
> > Buggy in which way?
> 
> My assumption was that the wrapper in the Python interpreter in
> Modules/posixmodules.c function os_sched_getaffinity_impl() does
> not work with the FreeBSD implementation of sched_getaffinity().
> 
> The relevant code in the Python wrapper is:
> 
>     ncpus = NCPUS_START;
>     while (1) {
>         setsize = CPU_ALLOC_SIZE(ncpus);
>         mask = CPU_ALLOC(ncpus);
>         if (mask == NULL)
>             return PyErr_NoMemory();
>         if (sched_getaffinity(pid, setsize, mask) == 0)
>             break;
>         CPU_FREE(mask);
>         if (errno != EINVAL)
>             return posix_error();
>         if (ncpus > INT_MAX / 2) {
>             PyErr_SetString(PyExc_OverflowError, "could not allocate "
>                             "a large enough CPU set");
>             return NULL;
>         }
>         ncpus = ncpus * 2;
>     }
> 
> NCPUS_START is 8 * sizeof(unsigned long) = 64 on a 64 bit CPU.
> 
> > Our cpuset_getaffinity(2) syscall returns ERANGE for cpuset size not
> > equal to CPU_SETSIZE.  It seems that python source expects EINVAL in
> > this case.
> 
> Yes, anything except EINVAL will cause the loop to exit prematurely.
> 
> > I can change the wrapper to translate ERANGE to EINVAL.  sched_setaffinity()
> > probably would require a symmetrical patch, but lets postpone it.
> 
> Yes.
> 
> > diff --git a/lib/libc/gen/sched_getaffinity.c b/lib/libc/gen/sched_getaffinity.c
> > index 2ae8c5b763a3..8748d7a60278 100644
> > --- a/lib/libc/gen/sched_getaffinity.c
> > +++ b/lib/libc/gen/sched_getaffinity.c
> > @@ -26,11 +26,29 @@
> >   * SUCH DAMAGE.
> >   */
> >  
> > +#include <errno.h>
> >  #include <sched.h>
> > +#include <string.h>
> >  
> >  int
> >  sched_getaffinity(pid_t pid, size_t cpusetsz, cpuset_t *cpuset)
> >  {
> > +	/*
> > +	 * Be more Linux-compatible:
> > +	 * - return EINVAL in passed size is less than size of cpuset_t
> > +	 *   in advance, instead of ERANGE from the syscall
> > +	 * - if passed size is larger than the size of cpuset_t, be
> > +	 *   permissive by claming it back to sizeof(cpuset_t) and
> > +	 *   zeroing the rest.
> > +	 */
> > +	if (cpusetsz < sizeof(cpuset_t))
> > +		return (EINVAL);
> > +	if (cpusetsz > sizeof(cpuset_t)) {
> > +		memset((char *)cpuset + sizeof(cpuset_t), 0,
> > +		    cpusetsz - sizeof(cpuset_t));
> > +		cpusetsz = sizeof(cpuset_t);
> > +	}
> > +
> >  	return (cpuset_getaffinity(CPU_LEVEL_WHICH, CPU_WHICH_PID,
> >  	    pid == 0 ? -1 : pid, cpusetsz, cpuset));
> >  }
> 
> I have rebuilt the C library with this patch, but it did not fix
> the problem, since the value checked in the loop is errno, not
> the return code of sched_getaffinity().
I see, thank you for noting this.

> 
> The following code is tested to work:
> 
> #include <errno.h>
> #include <sched.h>
> 
> int
> sched_getaffinity(pid_t pid, size_t cpusetsz, cpuset_t *cpuset)
> {
>         int result;
> 
>         result = cpuset_getaffinity(CPU_LEVEL_WHICH, CPU_WHICH_PID,
>             pid == 0 ? -1 : pid, cpusetsz, cpuset);
> 
>         if (result && errno == ERANGE)
>                 errno = EINVAL;
> 
>         return (result);
> }

I want to be more permissive outright, in particular, allow larger cpusets
than kernel handles.  Updated patch is below.

diff --git a/lib/libc/gen/sched_getaffinity.c b/lib/libc/gen/sched_getaffinity.c
index 2ae8c5b763a3..a26d098deb83 100644
--- a/lib/libc/gen/sched_getaffinity.c
+++ b/lib/libc/gen/sched_getaffinity.c
@@ -26,11 +26,30 @@
  * SUCH DAMAGE.
  */
 
+#include <errno.h>
 #include <sched.h>
+#include <string.h>
 
 int
 sched_getaffinity(pid_t pid, size_t cpusetsz, cpuset_t *cpuset)
 {
+	/*
+	 * Be more Linux-compatible:
+	 * - return EINVAL in passed size is less than size of cpuset_t
+	 *   in advance, instead of ERANGE from the syscall
+	 * - if passed size is larger than the size of cpuset_t, be
+	 *   permissive by claming it back to sizeof(cpuset_t) and
+	 *   zeroing the rest.
+	 */
+	if (cpusetsz < sizeof(cpuset_t)) {
+		errno = EINVAL;
+		return (-1);
+	if (cpusetsz > sizeof(cpuset_t)) {
+		memset((char *)cpuset + sizeof(cpuset_t), 0,
+		    cpusetsz - sizeof(cpuset_t));
+		cpusetsz = sizeof(cpuset_t);
+	}
+
 	return (cpuset_getaffinity(CPU_LEVEL_WHICH, CPU_WHICH_PID,
 	    pid == 0 ? -1 : pid, cpusetsz, cpuset));
 }