svn commit: r259609 - head/sys/kern

Thu Dec 19 11:16:18 UTC 2013

On Thu, 19 Dec 2013, Stefan Esser wrote:

> Log:
>  Fix overflow for timeout values of more than 68 years, which is the maximum
>  covered by sbintime (LONG_MAX seconds).

Not LONG_MAX seconds, but INT32_MAX seconds.  LONG_MAX seconds is about 2**32
times larger on 64-bit arches.

sbintimes and their complexity give many more possibilities for overflow.
I thought that the new overflow bugs and some old ones were already  fixed.

>  Some programs use timeout values in excess of 1000 years. The conversion
>  to sbintime caused wrap-around on overflow, which resulted in short or
>  negative timeout values. This caused long delays on sockets opened by
>  affected programs (e.g. OpenSSH).
>
>  Kernels compiled without -fno-strict-overflow were not affected, apparently
>  because the compiler tested the sign of the timeout value before performing
>  the multiplication that lead to overflow.

I think it just gave a garbage value that was accidentally less harmful.

>  When the -fno-strict-overflow option was added to CFLAGS, this optimization
>  was disabled and the test was performed on the result of the multiplication.
>  Negative products were caught and resulted in EINVAL being returned, but
>  wrap-around to positive values just shortened the timeout value to the
>  residue of the result that could be represented by sbintime.

This shows one reason why -fno-strict-overflow shouldn't be used.  It just
wastes time to give different undefined behaviour with undocumented details.

>  The fix is to cap the timeout values at the maximum that can be represented
>  by sbintime, which is 2^31 - 1 seconds or more than 68 years.

2^31 - 1 is a correct spelling of INT32_MAX, unlike LONG_MAX.

>  After this change, the kernel can be compiled with -fno-strict-overflow
>  with no ill effects.

> Modified: head/sys/kern/kern_event.c
> ==============================================================================
> --- head/sys/kern/kern_event.c	Thu Dec 19 07:33:07 2013	(r259608)
> +++ head/sys/kern/kern_event.c	Thu Dec 19 09:01:46 2013	(r259609)
> @@ -526,7 +526,8 @@ knote_fork(struct knlist *list, int pid)
> static __inline sbintime_t
> timer2sbintime(intptr_t data)
> {
> -
> +	if (data > LLONG_MAX / SBT_1MS)
> +		return LLONG_MAX;
> 	return (SBT_1MS * data);
> }

This has the following style bugs:
- it removes the empty line after the (null) declarations
- it is missing parentheses around the return value
- it uses the long long abomination

This has the following type errors:
- using the long long abomination is not just a style bug.  sbintime_t has
   type int64_t, not long long.  The overflow check will break if long long
   becomes longer than int64_t
- returning LLONG_MAX is usually just a style bug.  When long long is longer
   than int64_t, the return value will overflow, but the implementation-defined
   behaviour for this is usually to convert it to the correct value.

INT64_MAX is a dangerous default maximum timeout.  You can't even add the
minimum sbintime_t of 2**-32 seconds to this without overflowing.

Old timeout code used the following method to reduce the problem of
overflowing timeouts:
   For alarm() and setitimer(), limit the timeout to 100 million seconds
   (3.17 years) and return EINVAL if it is too large.  This works with
   32-bit time_t until about 2035.  POSIX doesn't allow this.  Clamping
   the time to 100 million seconds isn't allowed either, since the
   residual time can be seen from applications.  Note that 64-bit time_t
   has little effect on this problem.  Uses wishing to overflow the kernel
   simply ask for a timeout in a timeval of INT64_MAX seconds plus 999999
   seconds so that adding just 1 microseconds to it overflow.  With
   seconds in sbintime_t instead of in time_t, overflow occurs much
   sooner.

This is quite broken now:
- the limit of 100000000 used to be enforced in itimerfix(), but this was
   removed in connection with implementing POSIX realtime timers without
   even breaking the documentaion to match or touching PR(s) about the
   POSIX non-conformance of the limit.
- setitimer() mostly uses sbintimes now, so the old limit is irrelevant
   and enforcing it in itimerfix() would break the new sbintime code and
   presumably the not so new realtime timers code.  The sbintime code
   uses a modified limit that is also not allowed by POSIX.  This limit is
   of course undocumented.  It is INT32_MAX / 2, so that there is plenty
   of room for expansion.  This is what the above should use, modulo the
   conformance bug.  EINVAL is returned when this limit is exceeded.
- select() and poll() were more careful and seem to be correct now.
   The timeout in poll() is limited to INT_MAX milliseconds, so it fits
   in sbintime_t.  Except this assumes that int is 32 bits.  select()
   limits the timeout to INT32_MAX seconds so that it is representable
   as an sbintime_t and then does a further overflow check involving
   INT64_MAX.  When overflow would occur, it sets asbt to -1, which I
   think means an infinite timeout.  I don't like this.  -1 is a tiny
   amount less than 0, not near plus infinity.
- nanosleep() used to be less careful than select() and poll(), but seems
   to have been fixed.  For some reason, it uses a mixture of the above
   methods.  It limits sleep times to INT32_MAX / 2 seconds but repeats
   the sleep as necessary so as not to return an error.  The logic for
   this is subtle and I haven't checked it.

davide@ has patches related to fixing this, but none seem to have been
committed yet, except some old ones that gave some of the above overflow
checking.

Bruce