svn commit: r296543 - head/sys/compat/linux

Wed Mar 9 08:16:38 UTC 2016

On Tue, 8 Mar 2016, Dmitry Chagin wrote:

> Log:
>  Put a commit message from r296502 about Linux alarm() system call
>  behaviour to the source.
> ...
> Modified: head/sys/compat/linux/linux_misc.c
> ==============================================================================
> --- head/sys/compat/linux/linux_misc.c	Tue Mar  8 19:08:55 2016	(r296542)
> +++ head/sys/compat/linux/linux_misc.c	Tue Mar  8 19:20:57 2016	(r296543)
> @@ -206,6 +206,11 @@ linux_alarm(struct thread *td, struct li
> 	it.it_value.tv_usec = 0;
> 	it.it_interval.tv_sec = 0;
> 	it.it_interval.tv_usec = 0;
> +	/*
> +	 * According to POSIX and Linux implementation
> +	 * the alarm() system call is always successfull.
> +	 * Ignore errors and return 0 as a Linux do.
> +	 */

Why does this need a comment referring to external sources?  FreeBSD's
own man page also says that there is no error return for alarm(3).
However, the man page and the implementation are quite broken.  The
implementation in libc does have an error return, but this is
undocumented.  The documentation says that that the maximum number of
seconds is 100000000 (100 million), but doesn't say what happens when
this limit is exceeded.  The implementation of this is broken too.
What actually happens for the libc version is:
- secs <= 100 million works in all kernel versions
- in old kernel versions, secs > 100 million fails to set the alarm;
   it sets errno to EINVAL and returns (u_int)-1 to misindicate the error
- in FreeBSD-~[6-9], secs > 100 million works with 64-bit time_t.  With
   32-bit time_t, secs > INT_MAX fails; secs between INT_MAX and about
   1000 million (= the time from now until overflow) causes undefined
   behaviour due to overflow, but this might simulate working; secs between
   about 1000 million and 100 million work.
- in FreeBSD-~[10-current], secs > INT32_MAX / 2 fail and other cases have
   a chance of working.  In the failing cases, the error is misindicated as
   in old versions except for the different threshold.

> 	kern_setitimer(td, ITIMER_REAL, &it, &old_it);

Removing the error check broke this completely.

> 	if (timevalisset(&old_it.it_value)) {
> 		if (old_it.it_value.tv_usec != 0)

old_it is stack garbage when kern_setitimer() fails.  The value in
this stack garbage is now returned and errno is not set.  When the
error was checked, the consistent garbage value (u_int)-1 was returned,
and errno was not set.

This function worked almost correctly in FreeBSD-5, by duplicating
most of the internals of setitimer() including its limit of 100 million
which was not broken then.  This limit was to avoid overflow with 32-bit
time_t unless the current time is after year 2035.  It was broken in
2005.  It remained just broken with 32-bit time_t until FreeBSD-9.
Starting in FreeBSD-10, sbintime_t is used.  This reduces the brokenness
with 32-bit time_t but increases it with 64-bit time_t.  sbintime_t has
the same signed 32-bit limit for seconds as 32-bit time_t.  New ittimer
code using sbintime_t is more careful about overflow, but it cannot
support alarm() or large times in setitimer() even with 64-bit time_t.
It actually enforces a limit of INT32_MAX / 2 (~ 1000 million) and
doesn't documement this, where old code enforces a limit of 100 million
and does document this.  Man pages still document the old limit, but
no implmentations of alarm() except the old one for linux are aware
of this.

Complete description of bugs in this function:

X int
X linux_alarm(struct thread *td, struct linux_alarm_args *args)
X {
X 	struct itimerval it, old_it;
X 	u_int secs;
X 
X #ifdef DEBUG
X 	if (ldebug(alarm))
X 		printf(ARGS(alarm, "%u"), args->secs);
X #endif
X 
X 	secs = args->secs;
X

Style bug: extra blank line to separate related code.  Strict KNF doesn't
even allow blank lines to separate unrelated code.  The one after the
DEBUG ifdef is an example.  But bugs in indent(1) give extra blank lines
for ifdefs.

X 	if (secs > INT_MAX)
X 		secs = INT_MAX;

A bounds check is needed, but this one is very wrong.  We are going to
assign secs to tv_sec, and need to ensure that this doesn't overflow.
tv_sec used to have type long, so the correct limit for this was LONG_MAX.
POSIX broke this type, and FreeBSD was broken to conform in 2005.  Then
the correct limit became TIME_T_MAX, but this is unavailable under that
spelling.  INT_MAX accidentally works as well as possible for its intended
purpose it time_t is 32 bits and int is also 32 bits, but if time_t is
64-bits then it unnecessarily breaks alarm() in versions before sbintime_t.

However, the limit of INT_MAX doesn't work for the purpose of breaking
alarm() as little as possible.  It just ensured that kern_setitimer()
and thus this function always fails if kern_settimer() enforced its
documented limit of 100 million.  I think the change to use this limit
was made after setitimer()'s limit was broken.  Then this limit worked
for a while with 64-bit time_t, but with 32-bit time_t, using it always
gave overflow by adding INT_MAX to the current time.  Now with
sbintime_t and its limit of INT32_MAX / 2, using INT_MAX here ensures
that kern_settimer() and thus this function always fails...

The best limit to use here is 100 million, or perhaps INT32_MAX / 2 to
match kern_setitimer()s new limit, after checking that this works
(sbintime_t must be careful not to add INT32_MAX / 2 to either the
current time or more than 1 other value near INT_MAX32 / 2).

X

Style bug: extra blank line.

X 	it.it_value.tv_sec = (long) secs;

Style bugs: bogus cast, and space before cast.  The cast was not incorrect
before POSIX broke the type of tv_sec, but it should never have been
necessary (for avoiding compiler warnings) since compilers should see
that the limited value fits in tv_sec.

X 	it.it_value.tv_usec = 0;
X 	it.it_interval.tv_sec = 0;
X 	it.it_interval.tv_usec = 0;

Style bug in previous 2 lines: the corresponding libc code uses
timevalclear().  (Normally I prefer explicit code, but this function uses
a macro later.)

X 	/*
X 	 * According to POSIX and Linux implementation
X 	 * the alarm() system call is always successfull.
X 	 * Ignore errors and return 0 as a Linux does.
X 	 */

Style bugs: dubious commit, and formatting of this comment for ~60 column
terminals.  There are many delicate points in the error (mis)handling,
but the comment only describes an uninteresting one.  Comments should
probably be formatted narrow terminals, but that is an unusual style.
Other comment in this file use random right margins but many go out to
nearly column 89.

X 	kern_setitimer(td, ITIMER_REAL, &it, &old_it);

Bug: lost error handling.

After fixing the above limit of INT_MAX, the error here should never
occur, and you could assert that, but the error always occurs now
if secs > INT32_MAX / 2.  secs <= INT32_MAX / 2 is so sure to work with
the current sbintime_t code that this is not worth asserting.  (Perhaps
it will fail due to overflow, but kern_setitimer() will succeed.)
secs <= 100000000 is even more sure to work (until 2035).

X 	if (timevalisset(&old_it.it_value)) {

Bug: beginning of accesses to the uninitialized variable old_it in the
error case.

Style bugs: unnecessary test, and inconsistent use of macros.  libc
doesn't do this test.  All it does is extra work in the test to avoid
doing anything more in the usual case where the old timeout has expired.

X 		if (old_it.it_value.tv_usec != 0)
X 			old_it.it_value.tv_sec++;
X 		td->td_retval[0] = old_it.it_value.tv_sec;

Bug: return of garbage in the error case.

X 	}
X 	return (0);
X }

Bruce