svn commit: r201512 - in head: include lib/libc/gen usr.bin/catman usr.bin/makewhatis usr.sbin/lpr/common_source

Bruce Evans brde at optusnet.com.au
Mon Jan 4 17:30:01 UTC 2010


On Mon, 4 Jan 2010, Konstantin Belousov wrote:

> Log:
>  Modernize scandir(3) and alphasort(3) interfaces according to the IEEE
>  Std 1003.1-2008.

s/Modernize/Break.  (I sent private mail about this, but too late.)  The
old interfaces were correct, since they were compatible with qsort(),
unlike the new interfaces.  The new interfaces can be used, but are
just harder and/or slower to use correctly.  That they are harder to use
correctly is shown by their incorrect use (=> undefined behaviour) here.

>  Both Linux and Solaris conforms to the new definitions,
>  so we better follow too (older glibc used old BSDish alphasort prototype
>  and corresponding type of the comparision function for scandir). While
>  there, change the definitions of the functions to ANSI C and fix several
>  style issues nearby.

Hopefully they won't remain broken.  Linux seems to have been broken
relatively.  According to google and glibc-2.6 sources, some of the
history of this bug is:
- Linux libc4 and libc5 (pre glibc-2.0 (?)) had the "more precise" [i.e.,
   broken] POSIX prototype.
- glibc-2.0 reverted to the "less precise" [i.e., correct] BSD prototype.
- glibc-2.3 from May 18 2007 has the BSD prototype
- glibc was changed to have the POSIX prototype on Mar 16 2009.

>  POSIX also requires that alphasort(3) sorts as if strcoll(3) was used,
>  but leave the strcmp(3) call in the function for now.

ISTR a thread on the POSIX mailing list saying that strcoll() is unusable
here.  Directory entries may contain fairly raw bytes which might not
be understood by strcoll().

>  Adapt in-tree callers of scandir(3) to new declaration. The fact that
>  select_sections() from catman(1) could modify supplied struct dirent is
>  a bug.

I don't see any bug here, except another one in the POSIX prototype.
catman is allowed to modify its own dirents, and does so.  This
modification may even be required for its call to scandir() to work,
and such modifications may even be useful for working around the
strcoll() bug -- you could reasonably modify the names in the dirents 
to ensure that strcoll() can handle them.  However, such modifications
shouldn't be allowed by scandir(), especially for the comparison function
(select_sections() is the select function).  However2, scandir()'s new
prototype doesn't say this -- it uses "const struct dirent **" where
strict const'ness would require "struct dirent const * const *" (parse?).
I think it does this for the same reason that the execve() and strtol()
families leave out one `const' -- 2 'const' cause too many problems.
With the old interface, there is necessarily only 1 `const', so type
safety is incomplete in another (mostly less useful) way.

> Modified: head/lib/libc/gen/scandir.c
> ==============================================================================
> --- head/lib/libc/gen/scandir.c	Mon Jan  4 15:34:49 2010	(r201511)
> +++ head/lib/libc/gen/scandir.c	Mon Jan  4 15:40:17 2010	(r201512)
> @@ -58,11 +58,9 @@ __FBSDID("$FreeBSD$");
> 	    (((dp)->d_namlen + 1 + 3) &~ 3))
>
> int
> -scandir(dirname, namelist, select, dcomp)
> -	const char *dirname;
> -	struct dirent ***namelist;
> -	int (*select)(struct dirent *);
> -	int (*dcomp)(const void *, const void *);
> +scandir(const char *dirname, struct dirent ***namelist,
> +    int (*select)(const struct dirent *), int (*dcomp)(const struct dirent **,
> +	const struct dirent **))
> {
> 	struct dirent *d, *p, **names = NULL;
> 	size_t nitems = 0;
> @@ -111,26 +109,25 @@ scandir(dirname, namelist, select, dcomp
> 	}
> 	closedir(dirp);
> 	if (nitems && dcomp != NULL)
> -		qsort(names, nitems, sizeof(struct dirent *), dcomp);
> +		qsort(names, nitems, sizeof(struct dirent *),
> +		    (int (*)(const void *, const void *))dcomp);

This bogus cast prevents detection of undefined behaviour.  Casting
dcomp to an `int (*)(const void *, const void *)' does not make it a
function of that type.

The old code was correct.

> 	*namelist = names;
> -	return(nitems);
> +	return (nitems);
>
> fail:
> 	while (nitems > 0)
> 		free(names[--nitems]);
> 	free(names);
> 	closedir(dirp);
> -	return -1;
> +	return (-1);
> }
>
> /*
>  * Alphabetic order comparison routine for those who want it.
>  */
> int
> -alphasort(d1, d2)
> -	const void *d1;
> -	const void *d2;
> +alphasort(const struct dirent **d1, const struct dirent **d2)

Undefined behaviour results when this function is called (if dcomp =
alphasort).  qsort() can only know to call a function of
type `int (*)(const void *, const void *)', and alphasort() doesn't
have that type.

> {
> -	return(strcmp((*(struct dirent **)d1)->d_name,
> -	    (*(struct dirent **)d2)->d_name));

The weak type checking let it get away with no const's at all here.

> +
> +	return (strcmp((*d1)->d_name, (*d2)->d_name));

Further undefined behaviour results when the parameters are accessed.

The undefined behaviour can be avoided using a wrapper function.  This
would be ugly for scandir(), since dcomp() would have to be passed as
a global.  The wrapper function would convert from (const void *d1,
const void *d2) (the parameters passed to a qsort() comparison function)
to (struct dirent **d1, struct dirent **d2) (the parameters accepted
by a broken POSIX comparison function), using essentially the same
code as the old version of the above.  But this is a stupid way to
implement a qsort() comparision function -- both ways need to do the
conversion on every call, but the broken way requires an extra function
to do it, in a context where inlining is usually impossible.  These
conversions are usually simple but will require extra register or
stack copying with the extra function call.

> }

Similar undefined behaviour has been fixed in a few utilities in FreeBSD.

> Modified: head/usr.bin/catman/catman.c
> ==============================================================================
> --- head/usr.bin/catman/catman.c	Mon Jan  4 15:34:49 2010	(r201511)
> +++ head/usr.bin/catman/catman.c	Mon Jan  4 15:40:17 2010	(r201512)
> @@ -589,9 +589,15 @@ process_section(char *mandir, char *sect
> }
>
> static int
> -select_sections(struct dirent *entry)
> +select_sections(const struct dirent *entry)
> {
> -	return directory_type(entry->d_name) == MAN_SECTION_DIR;
> +	char *name;
> +	int ret;
> +
> +	name = strdup(entry->d_name);
> +	ret = directory_type(name) == MAN_SECTION_DIR;
> +	free(name);
> +	return (ret);
> }

I think the modification is not needed, and also doesn't actually happen
here (since directory names cannot contain slashes).

Bruce


More information about the svn-src-head mailing list