svn commit: r343636 - head/sbin/pfilctl

Fri Feb 1 06:59:51 UTC 2019

On Fri, 1 Feb 2019, Gleb Smirnoff wrote:

> Log:
>  Hopefully fix compilation by other compilers.

You mean "Hopefully fix compilation by compilers whose -Wnested-externs
support is not broken".

bsd.sys.mk sets -Wnested-externs at WARNS >= 6 and also -Wredundant-decls
to inhibit the engineering and style bug of declaring extern variables
outside of header files.  Certain compilers whose -Wnested-includes is
broken also have a broken -Wredundant-decls.  The brokenness is typically
to silently ignore these flags.  Even adding -pedantic doesn't fix this
in certain compilers.

> Modified: head/sbin/pfilctl/pfilctl.c
> ==============================================================================
> --- head/sbin/pfilctl/pfilctl.c	Fri Feb  1 00:33:17 2019	(r343635)
> +++ head/sbin/pfilctl/pfilctl.c	Fri Feb  1 00:34:18 2019	(r343636)
> @@ -94,9 +94,8 @@ main(int argc __unused, char *argv[] __unused)
> static void
> help(void)
> {
> -	extern char *__progname;
>
> -	fprintf(stderr, "usage: %s (heads|hooks|link|unlink)\n", __progname);
> +	fprintf(stderr, "usage: %s (heads|hooks|link|unlink)\n", getprogname());
> 	exit(0);
> }

Here __progname is an implementation detail, and it is intentionally not
declared in a header file.  Bad code like the above was chummy with the
implementation and declared it as part of the chumminess.  Compilers with
non-broken -Wnested-includes used to be more common and detected this bug.
If __progname were declared in an included header files, then compilers
with a non-broken -Wredundant-decls would detect another bug.

The change is to use the correct API.

This function has many other style bugs.  Normal programs spell help()
as usage().  Even this program still prints "usage: " and not "help: ".
There are a lot of style rules for usage(), and one is to normally use
a hard-coded name for the program and not use getprogname() or argv[0],
or even warnx().  warnx() would do the same thing as a normal usage(),
except it would print "%s: usage:" instead of "usage: %s", where %s
is getprogname() for warnx() and a hard-coded name for normal usage().
The warnx() order is better but is not traditional.

However, some programs like this one get different features according
to the program name given by argv[0].  rm is a good example of how to
do this, and this program is a bad example of how to do this.  rm has
2 name, rm and unlink.  It starts by taking the basename of argv[0]
(hackishly using strrchr(3) instead of basename(3)).  Then it prints
separate usage messages for rm and unlink using hard coded names for
both.  It could be improved by only printing the usage message for the
current name.

This program starts by not taking the basename of argv[0], so it fails
to find the correct program if argv[0] has a path prefix.  It has 4
alternative names where rm has only 2, and it uses the style bug
of !strcmp() to search its table where rm uses strcmp() == 0.  Later
it prints a wrong usage message.  The usage message should have 4
alternative hard-coded program names following by options relevant to
each name (as in rm), or only the usage relative to the current name.
Instead it has the current name followed by the syntax error of repeating
one of of the possible names, and no options.  The man page gives normally
formatted syntax, as for rm.

Usage messages should be checked to be lexically identical to man page
synopses after removing "usage: " and leading whitespace.  man(1) limits
line lengths, and it is important for usage() to limit line lengths
identically exept for the "usage: " prefix, for readability and for easy
comparison.  man(1) wraps long lines and comparison is too difficult
if usage() wraps differently.  man(1) leaves a large left margin and a
small right margin, and that is enough for the margin given by "usage: ".

One reason to hard-code names in usage() is that they have to be hard-
coded in the man page.  The action has to be the same as in the man page,
_especially_ for programs like rm and unlink where the name affects the
action.  Otherwise, the only thing that using the dynamic name does is
to slightly de-obfuscate obfuscations and security attacks like
"ln /bin/rm $HOME/bin/ls".

getprogname(3) is documented to "manipulate" the name of the program, but
no details of the manipulation are documented.  It actually returns
__progname.  __progname is set in too many different ways:
- csu uses a manually inlined version of strrchr() to find the base name
- rtld uses __progname = obj_rtld.path in 2 places and basename(argv[0])
   in 1 places.  The 2 places seem to be missing taking the base name.
- setprogname() uses strrchr() to find the base name
- using basename() is usually worse than the home made methods for most
   uses.  basename() has extras like converting NULL and "" to "." and
   removing trailing slashes.  The extras make it non-reentrant.
   basename() was fixed in 4.4BSD or earlier to allocate memory and take
   a const char * arg, but was later broken to POSIX spec.

setprogname() is documented to return a pointer into its string arg, so
that the string should not be modified.

The documentation of basename() is not so good.  It doesn't emphasize that
the string should not be modified.  It gives 2 implementation methods, with
the first one the current method and the second one the old BSD method, then
says that the "former" method is thread-safe.  It is actually the latter
method that is thread-safe.  It was only former in time.  Except it was
also latter in time (between 2 thread-unsafe implementations).

Stripping trailing slashes is usually wrong.  However, for program names
it is harmless, because program names can't name directories.  If they did
have trailing slashes, then the strrchr() method would be harmful -- it would
give "" for "ls/", while basename() gives "ls".  But the program name can't
be "ls/", since "ls/" is a directory if it exists.

Bruce