svn commit: r340450 - head/sys/sys

Bruce Evans brde at optusnet.com.au
Mon Nov 19 15:20:08 UTC 2018


On Tue, 20 Nov 2018, Bruce Evans wrote:

> On Mon, 19 Nov 2018, Warner Losh wrote:
>
>> On Mon, Nov 19, 2018 at 12:31 AM Andriy Gapon <avg at freebsd.org> wrote:
>
>>> As a side note, I wonder if those functions are ever used on negative
>>> values,
>>> given the type of the argument, and if anyone checked their correctness in
>>> that
>>> case.
>> 
>> I don't think so, but an assert would be good to make sure.
>
> I checked them on all valid values.  They obviously can't work, since the
> constant is garbage.  Unfortunately, I lost the test program.  IIRC, it
> finds about 80 million out of 1 billion wrong results for nsec precision,
> 32 out of 1 million wrong for usec precision, and 0 out of 1000 wrong for
> msec precision.

I found my test program.

>> But I think I understand the problem now.
>> 
>> mstosbt(1000) is overflowing with my change, but not without because we're
>> adding 2^32 to a number that's ~900 away from overflowing changing a very
>
> This value is just invalid.  Negative values are obviously invalid.  Positive
> values of more than 1 second are invalid.  In the old version, values of
> precisely 1 second don't overflow since the scale factor is rounded down;
> the result is just 1 unit lower than 1 second.  Overflow (actually 
> truncation)
> occurs at 1 unit above 1 second.  E.g., sbttoms(mstosbt(1000)) was 999 and
> sbttoms(mstosbt(1001)) was 0.  Now in my fixed version, 
> sbttoms(mstosbt(1000))
> is truncated to 0 and sbttoms(mstosbt(1001)) is truncated to 1.
>
> The test program also showed that in the old version, all valid args except
> 0 are reduced by precisely 1 by conversion to sbt and back.
>
>> large sleep of 1 second to a tiny sleep of 0 (which is 1ms at the default
>> Hz). I think we get this in the mlx code because there's a msleep(1000) in
>> at least one place. msleep(1001) would have failed before my change. Now I
>> think msleep(999) works, but msleep(1000) fails. Since the code is waiting
>> a second for hardware to initialize, a 1ms instead is likely to catch the
>> hardware before it's finished. I think this will cause problems no matter
>> cold or not, since it's all pause_sbt() under the covers and a delay of 0
>> is still 0 either way.
>
> Bug in the mlx code.  The seconds part must be converted separately, as in
> tstosbt().

mlx doesn't seem to use sbt directly, or even msleep(), so the bug does
seem to be central.

mlx actually uses mtx_sleep() with timo = hz().   mtx_sleep() is actually
an obfuscated macro wrapping _sleep().  The conversion to sbt is done by
the macro and is sloppy.  It multiplies timo by tick_sbt.

tick_sbt is SBT_1S / hz, so the sbt is SBT_1S / hz * hz which is SBT_1S
reduced a little.  This is not affected by the new code, and it still isn't
converted back to 1 second in ms, us or ns.  Even if it is converted back
and then forth to sbt using the new code, it remains less than 1 second as
an sbt so shouldn't cause any new problems.

Here is the test program:

XX /* Test decimal<->sbt conversions. */
XX 
XX #include <sys/time.h>
XX 
XX #include <stdio.h>
XX 
XX int
XX main(int argc, char **argcv)
XX {
XX 	uint64_t sbt;
XX 	int i, j, wrong;
XX 
XX 	wrong = 0;
XX 	for (i = 0; i < 1000; i++) {
XX 		sbt = mstosbt(i);
XX 		j = sbttoms(sbt);
XX 		if (i != j) {
XX 			wrong++;
XX 			if (argc > 1)
XX 				printf("%d -> %#jx -> %d\n",
XX 				    i, (uintmax_t)sbt, j);
XX 		}
XX 	}
XX 	printf("%d wrong values for ms\n", wrong);
XX 
XX 	wrong = 0;
XX 	for (i = 0; i < 1000000; i++) {
XX 		sbt = ustosbt(i);
XX 		j = sbttous(sbt);
XX 		if (i != j) {
XX 			wrong++;
XX 			if (argc > 1)
XX 				printf("%d -> %#jx -> %d\n",
XX 				    i, (uintmax_t)sbt, j);
XX 		}
XX 	}
XX 	printf("%d wrong values for us\n", wrong);
XX 
XX 	wrong = 0;
XX 	for (i = 0; i < 1000000000; i++) {
XX 		sbt = nstosbt(i);
XX 		j = sbttons(sbt);
XX 		if (i != j) {
XX 			wrong++;
XX 			if (argc > 1)
XX 				printf("%d -> %#jx -> %d\n",
XX 				    i, (uintmax_t)sbt, j);
XX 		}
XX 	}
XX 	printf("%d wrong values for ns\n", wrong);
XX }

Output:
0 wrong values for ms
32 wrong values for us
82602428 wrong values for ns

Bruce


More information about the svn-src-all mailing list