svn commit: r292777 - in head: lib/libc/sys sys/kern

Bruce Evans brde at optusnet.com.au
Mon Dec 28 10:17:55 UTC 2015


On Mon, 28 Dec 2015, Konstantin Belousov wrote:

> On Mon, Dec 28, 2015 at 09:35:11AM +1100, Bruce Evans wrote:
>> If this causes a panic, then it is from a sanity check detecting the
>> invalid conversion later.  A negative value in days breaks the loop
>> logic but seems to give premature exit from the loops instead of many
>> iterations.
> It causes a panic due to out-of-bounds accesses to the bin2bcd_data[] array.
> The issue affects around twenty rtc drivers, according to a quick grep
> for clock_ts_to_ct() usage.  It probably also affects the geom raid modules,
> but I did not look there at all.
>
> As I understand it, people prefer to keep the ability to test practically
> useless values for the current time, at the cost of an easy, unplugged
> kernel panic on production systems?  Am I right?

It is not unreasonable to panic when such tests fail, just like for other
settings of unreasonable values.  Only the superuser can make them, and
the superuser should know better than to run them on production systems.

Of course, the correct fix is to fail for unrepresentable values.  Different
subsystems have different limits, so valid negative timevals may become
invalid for certain RTC hardware or software.  Valid positive timevals
may also become invalid.  The RTC might not have a century register, or
the one it has might be unreliable, as on x86 (the option USE_RTC_CENTURY
tells atrtc to use the century register, but it is so little used that
it is not a supported option).  We use a hack to translate to a range of
years that includes the current time.  This used to be done almost correctly.
It gave the range 1970-2069, so the Epoch was representable except in the
Western hemisphere when the RTC is on local time.  Now the range is 1980-
2079, so the Epoch is never representable.  If someone tries to set the
time to the Epoch, they get 2070 instead of 1970.

> The commit gave
> immediate relief for the issue.  If somebody has the correct fix
> for clock_ts_to_ct(), I am happy to see this commit reverted after the
> proper fix.

It only avoids some out-of-bounds values.  I already pointed out that
negative seconds still occur in the Western hemisphere for times near the
Epoch, etc.

Other overflow bugs are now easier to see.  64-bit time_t's allow really
large times.  The maximum with signed ones is about 292 billion years.
'year' has the wrong type (int) so it cannot represent that many.  I
think the loop to reduce 'days' iterates that many times and 'year'
overflows every 2**32'nd iteration and the result is normally as if
calculated correctly but then assigned to a 32-bit signed int.  It can
overflow to any 32-bit signed value.  Fortunately, the calculation
doesn't overrun a table of leap years.  It just uses a simple rule for
leap years which is expected to work until 2037.  Years larger than
that are not supported elsewhere.  Next, the overflowed year is assigned
to ct->year.  This is also int, so no further overflow occurs.  ct->year
is misnamed (missing a prefix) and has a comment saying that it is a
4-digit year, but the code can produce any value.  Next, ct->year is
trusted by at least the atrtc driver.  This gives out-of-bounds
accesses to bin2bcd_data[] in about 49.5% of cases if USE_RTC_CENTURY
is not configured, and in almost all cases if it is configured:

X 	writertc(RTC_YEAR, bin2bcd(ct.year % 100));	/* Write back Year    */
X #ifdef USE_RTC_CENTURY
X 	writertc(RTC_CENTURY, bin2bcd(ct.year / 100));	/* ... and Century    */
X #endif

ct.year overflows to a negative value in about 50% of the cases.  Then
ct.year % 100 is negative in 99% of the subcases and 0 in 1% of the
subcases.

ct.year / 100 is negative in about 50% of the cases.  When it is positive,
it is too large for the array except for the small range 0-9999 for ct.year.
The comment on ct->year is correct that the year must have only 4 digits.
Anything larger than that takes more than 2 bin2bcd steps and more than
2 bcd registers to represent.  Other drivers are probably similarly
sloppy, but they should be able to handle 4 digits by throwing away the
century digits, as in the usual case above.

The adjustment to give times between 1980 and 2079 is for the other
direction.  That direction is more careful since the time comes from
untrusted hardware instead of the trusted superuser.  I pointed out
some missing checks on lower bounds in clock_ct_to_ts().  The one
for the month is not missing and is most important since the month
is used as an array index and has base 1.  The month day also has base 1
but is not used as an array index.  Buggy hardware might produce
0 but can't produce a negative value for seconds, minutes, or hours if it
uses bcd encoding.  This leaves the year as most in need of a lower
bounds check.  The code is accidentally fail-safe for years -- years
before 1970 are converted to 1970, except for the leap year calculation.

subr_clock.c is a little under-engineered.  subr_fattime.c is grossly
over-engineered, but not enough to have the necessary overflow checking.
Its overflow bugs are easy to exercise using 64-bit time_t's.  Just
use utimes() or even utime() to try to set huge or negative times.
There are lots of overflows, but the end result is in bounds because
range checks are done on garbage values produced by the overflows, so
the garbage doesn't get far unless it is within bounds.

Though it is unimportant, subr_fattime.c needs to be more careful with
overflow than subr_clock.c since using it is not restricted to the
kernel and the super-user.

Bruce


More information about the svn-src-all mailing list