powerpc64 head -r344018 stuck sleeping problems: th->th_scale * tc_delta(th) overflows unsigned 64 bits sometimes [patched failed]

Mark Millard marklmi at yahoo.com
Fri Mar 1 07:39:19 UTC 2019


[The new, trial code also has truncation occurring.]

On 2019-Feb-28, at 17:55, Mark Millard <marklmi at yahoo.com> wrote:

> [The PowerMac becomes non-responsive for significant periods of time.]
> 
> On 2019-Feb-28, at 07:08, Konstantin Belousov <kib at freebsd.org> wrote:
> 
>> On Thu, Feb 28, 2019 at 04:55:42PM +0200, Konstantin Belousov wrote:
>>> On Thu, Feb 28, 2019 at 05:06:23AM -0800, Mark Millard via freebsd-ppc wrote:
>>>> . . .
>>> 
>>> . . .
>> 
>> Of course I botched the formula, please try this instead:
>> 
>> diff --git a/sys/kern/kern_tc.c b/sys/kern/kern_tc.c
>> index 2656fb4d22f..fdd4f4f6a52 100644
>> --- a/sys/kern/kern_tc.c
>> +++ b/sys/kern/kern_tc.c
>> @@ -355,13 +355,22 @@ void
>> binuptime(struct bintime *bt)
>> {
>> 	struct timehands *th;
>> -	u_int gen;
>> +	uint64_t scale, x;
>> +	u_int delta, gen;
>> 
>> 	do {
>> 		th = timehands;
>> 		gen = atomic_load_acq_int(&th->th_generation);
>> 		*bt = th->th_offset;
>> -		bintime_addx(bt, th->th_scale * tc_delta(th));
>> +		scale = th->th_scale;
>> +		delta = tc_delta(th);
>> +		if (fls(scale) + fls(delta) > 63) {
>> +			x = (scale >> 32) * delta;
>> +			scale &= UINT_MAX;
>> +			bt->sec += x >> 32;
>> +			bintime_addx(bt, x << 32);
>> +		}
>> +		bintime_addx(bt, scale * delta);
>> 		atomic_thread_fence_acq();
>> 	} while (gen == 0 || gen != th->th_generation);
>> }
>> @@ -388,13 +397,22 @@ void
>> bintime(struct bintime *bt)
>> {
>> 	struct timehands *th;
>> -	u_int gen;
>> +	uint64_t scale, x;
>> +	u_int delta, gen;
>> 
>> 	do {
>> 		th = timehands;
>> 		gen = atomic_load_acq_int(&th->th_generation);
>> 		*bt = th->th_bintime;
>> -		bintime_addx(bt, th->th_scale * tc_delta(th));
>> +		scale = th->th_scale;
>> +		delta = tc_delta(th);
>> +		if (fls(scale) + fls(delta) > 63) {
>> +			x = (scale >> 32) * delta;
>> +			scale &= UINT_MAX;
>> +			bt->sec += x >> 32;
>> +			bintime_addx(bt, x << 32);
>> +		}
>> +		bintime_addx(bt, scale * delta);
>> 		atomic_thread_fence_acq();
>> 	} while (gen == 0 || gen != th->th_generation);
>> }
> 
> The PowerPC G5 ends up not responsive for long periods and
> responsive for only very short periods, such as being able
> to type in a few letters and have them show up at the time.
> 
> I've only barely started investigating what is going on
> and I'll be rechecking my instrumented variant for stupid
> mistakes and such. I'll try your un-instrumented binuptime
> as well.
> 
> As things stand, I'll be updating the kernel via booting a 2nd
> disk that is not being experimented with.
> 
> My information gathering may not be very timely.
> 

Live experimentation and inspection have proved problematic.
Below I experiment with the prior scale_factor and tim_diff
figures that my original instrumented code recorded,
showing some of what your new code does with them. (In part,
the text below is from the original list submittal, to
provide context.)

Observed consistently for tc->tc_frequency:

tc->tc_frequency == 0x1fca055 (i.e., 33333333)

( tc->tc_counter_mask is 0xfffffffful as well. )

An example observation of diff_scaled having an overflowed
value was:

scale_factor            == 0x80da2067ac
scale_factor*freq overflows an unsigned 64-bit representation.
tim_offset              ==   0x3da0eaeb
tim_cnt                 ==   0x42dea3c4
tim_diff                ==    0x53db8d9
For reference:                0x1fc9d43 == 0xffffffffffffffffull/scale_factor
scaled_diff       == 0xA353A5BF3FF780CC (truncated to 64 bits)
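
As an illustration of the figures above, here is a minimal sketch in C
(user-space, not kernel code). The helper names mul_overflows_u64,
mul_trunc_u64, and check_overflow_example are hypothetical and do not
come from kern_tc.c; only the constants are the observed values listed
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when a * b does not fit in 64 unsigned bits. */
static bool
mul_overflows_u64(uint64_t a, uint64_t b)
{
	return (a != 0 && b > UINT64_MAX / a);
}

/* What a plain uint64_t multiply yields: the product modulo 2^64. */
static uint64_t
mul_trunc_u64(uint64_t a, uint64_t b)
{
	return (a * b);
}

/* Re-checks the observed example using the values reported above. */
static void
check_overflow_example(void)
{
	const uint64_t scale_factor = 0x80da2067acULL;
	const uint64_t tim_diff = 0x53db8d9ULL;

	/* Largest multiplier that still fits: the "For reference" value. */
	assert(UINT64_MAX / scale_factor == 0x1fc9d43ULL);

	/* tim_diff exceeds that limit, so the product overflows ... */
	assert(mul_overflows_u64(scale_factor, tim_diff));

	/* ... and only the low 64 bits survive the multiply. */
	assert(mul_trunc_u64(scale_factor, tim_diff) ==
	    0xA353A5BF3FF780CCULL);
}
```

Calling check_overflow_example() from a small main() should pass all
three assertions if the figures above are as reported.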

But for the new, trial code:

0x80da2067ac is 40 bits
   0x53db8d9 is 27 bits
So 67 bits, more than 63. Then:

   x
== (0x80da2067ac>>32) * 0x53db8d9
== 0x80 * 0x53db8d9
== 0x29EDC6C80

   x>>32
== 0x2

   x<<32
== 0x9EDC6C8000000000 (limited to 64 bits)
Note the truncation of: 0x29EDC6C8000000000.

Thus the "bintime_addx(bt, x << 32)" is still
based on a truncated value.
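
The arithmetic above can be re-checked with a small user-space sketch
in C. This is not the kernel code: bit_length_u64, struct split_terms,
split_scale_delta, and check_trial_example are hypothetical names
introduced only for this illustration, and the bit counting is assumed
to be done over the full 64-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of significant bits in v (0 for v == 0). */
static int
bit_length_u64(uint64_t v)
{
	int n = 0;

	while (v != 0) {
		n++;
		v >>= 1;
	}
	return (n);
}

/* Intermediate values of the trial code's "> 63" branch. */
struct split_terms {
	uint64_t x;		/* (scale >> 32) * delta */
	uint64_t x_hi;		/* x >> 32 */
	uint64_t x_lo_shifted;	/* x << 32, limited to 64 bits */
};

static struct split_terms
split_scale_delta(uint64_t scale, uint32_t delta)
{
	struct split_terms t;

	t.x = (scale >> 32) * delta;
	t.x_hi = t.x >> 32;
	t.x_lo_shifted = t.x << 32;	/* bits above bit 63 are dropped */
	return (t);
}

/* Re-checks the example worked through above. */
static void
check_trial_example(void)
{
	const uint64_t scale = 0x80da2067acULL;
	const uint32_t delta = 0x53db8d9U;
	struct split_terms t = split_scale_delta(scale, delta);

	assert(bit_length_u64(scale) == 40);
	assert(bit_length_u64(delta) == 27);
	assert(t.x == 0x29EDC6C80ULL);
	assert(t.x_hi == 0x2ULL);
	assert(t.x_lo_shifted == 0x9EDC6C8000000000ULL);
}
```

For these inputs the bits shifted out of x << 32 are 0x2, the same
value as x >> 32, which is the amount the trial code adds to bt->sec.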

I'll not bother with the other two examples unless you
want such.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


