svn commit: r297039 - head/sys/x86/x86

Bruce Evans brde at optusnet.com.au
Thu Mar 24 14:57:38 UTC 2016


On Thu, 24 Mar 2016, Konstantin Belousov wrote:

> On Wed, Mar 23, 2016 at 02:21:59PM -0700, John Baldwin wrote:
>> As you noted, the issue is if a timecounter needs locks (e.g. i8254) though
>> outside of that I think the patch is great. :-/  Of course, if the TSC
>> isn't advertised as invariant, DELAY() is talking to the timecounter
>> directly as well.
>>
>> However, I think we probably can use the TSC.  The only specific note I got
>> from Ryan (cc'd) was about the TSC being unstable as a timecounter under KVM.
>> That doesn't mean that the TSC is non-monotonic on a single vCPU.  In fact,
>> thinking about this more I have a different theory to explain how the TSC
>> can be out of whack on different vCPUs even if the hardware TSC is in sync
>> in the physical CPUs underneath.
> In fact, if we can use the TSC with the only requirement of being monotonic,
> I do not see why we need the TSC at all. We can return to the pre-r278325
> loop, but calibrate the number of loop iterations for a known delay of
> 1us, once on boot.  Do you agree with this?

As a comment in the previous version says, the old method is highly
bogus since it is sensitive to CPU clock speed.

My systems allow speed variations of about 4000:800 = 5:1 for one CPU and
about 50:1 for different CPUs.  So the old method gave a variation of up
to 50:1.  This can be reduced to only 5:1 using the boot-time calibration.

Old versions of DELAY() had similar problems.  E.g., in FreeBSD-7:

X 	if (tsc_freq != 0 && !tsc_is_broken) {
X 		uint64_t start, end, now;
X 
X 		sched_pin();
X 		start = rdtsc();
X 		end = start + (tsc_freq * n) / 1000000;
X 		do {
X 			cpu_spinwait();
X 			now = rdtsc();
X 		} while (now < end || (now > start && end < start));
X 		sched_unpin();
X 		return;
X 	}

This only works if the TSC is invariant.  tsc_freq_changed() updates
tsc_freq, but it is too hard to do this atomically, so the above is
broken when it is called while the frequency is being updated (and
this can be arranged by single stepping through the update code).
The bug might break at least the keyboard i/o used to do the single
stepping, depending on whether the DELAY()s in it are really needed
and on how much these DELAY()s are lengthened or shortened.

This loop doesn't need to use the TSC, but can use a simple calibrated
loop, since the delay only needs to be right to within a few usec or
a factor of 2 or so, or any longer time (so it doesn't matter if the
thread is preempted for many quanta).  The calibration just needs to
be updated whenever the CPU execution rate changes.
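To make that concrete, here is a rough userland model of such a
calibrated loop (all names here are made up for illustration; this is
not the proposed kernel code, and a kernel version would calibrate
against whatever reference is trusted at boot instead of
CLOCK_MONOTONIC):

	#include <stdint.h>
	#include <stdio.h>
	#include <time.h>

	/* Hypothetical calibrated-delay model; not the FreeBSD DELAY(). */
	static uint64_t loops_per_us;		/* set once by calibrate_delay() */

	static void
	spin(uint64_t loops)
	{
		while (loops-- != 0)
			__asm __volatile("pause");	/* like cpu_spinwait() */
	}

	static uint64_t
	elapsed_ns(const struct timespec *a, const struct timespec *b)
	{
		return ((uint64_t)(b->tv_sec - a->tv_sec) * 1000000000 +
		    (b->tv_nsec - a->tv_nsec));
	}

	static void
	calibrate_delay(void)
	{
		struct timespec t0, t1;
		uint64_t loops = 1000000, ns;

		/* Time a known number of iterations against a reference clock. */
		clock_gettime(CLOCK_MONOTONIC, &t0);
		spin(loops);
		clock_gettime(CLOCK_MONOTONIC, &t1);
		ns = elapsed_ns(&t0, &t1);
		loops_per_us = loops * 1000 / (ns != 0 ? ns : 1);
		if (loops_per_us == 0)
			loops_per_us = 1;
	}

	static void
	delay_us(uint64_t us)
	{
		/* Only needs to be right to within a factor of 2 or so. */
		spin(us * loops_per_us);
	}

	int
	main(void)
	{
		struct timespec t0, t1;

		calibrate_delay();
		clock_gettime(CLOCK_MONOTONIC, &t0);
		delay_us(1000);			/* ask for ~1 ms */
		clock_gettime(CLOCK_MONOTONIC, &t1);
		printf("asked for 1000 us, got %ju ns\n",
		    (uintmax_t)elapsed_ns(&t0, &t1));
		return (0);
	}

In the kernel the same scheme would need the calibration redone (or
the count rescaled) whenever speed stepping changes the execution
rate, which is exactly the part the old loop-based DELAY() got wrong.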

Newer CPUs give complications like the CPU execution rate being independent
of the (TSC) CPU frequency.  An invariant TSC gives an invariant CPU
frequency, but the execution rate may change in lots of ways for lots
of reasons.

The cputicker has similar problems with non-invariant CPUs.  It
recalibrates, but there are races and other bugs in the recalibration.
These bugs are small compared with counting ticks at old rate(s) as
being at the current rate.
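As a sketch of the bookkeeping that avoids the larger bug (hypothetical
names, not the actual cputicker code): ticks have to be folded into the
running total at the rate that was in effect while they were counted,
before the new rate takes over.

	#include <stdint.h>

	/*
	 * Hypothetical tick-to-time accounting.  base_ns holds the time
	 * already converted at earlier rates, so ticks counted at an old
	 * rate are never scaled by the current one.
	 */
	struct tickclock {
		uint64_t base_ticks;	/* tick count at the last rate change */
		uint64_t base_ns;	/* time accumulated before that change */
		uint64_t hz;		/* current tick rate */
	};

	static uint64_t
	tickclock_ns(const struct tickclock *tc, uint64_t now_ticks)
	{
		/* A real version must avoid overflow in this multiply. */
		return (tc->base_ns +
		    (now_ticks - tc->base_ticks) * 1000000000 / tc->hz);
	}

	static void
	tickclock_newrate(struct tickclock *tc, uint64_t now_ticks,
	    uint64_t new_hz)
	{
		/* Fold in the ticks counted at the old rate, then switch. */
		tc->base_ns = tickclock_ns(tc, now_ticks);
		tc->base_ticks = now_ticks;
		tc->hz = new_hz;
	}

Without the fold-in step, ticks accumulated at 800 MHz would later be
divided by a 4 GHz rate and the elapsed time understated by the same
5:1 ratio mentioned above.  The sketch also ignores the locking needed
when readers race with tickclock_newrate(), which is the other class of
bug mentioned.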

Bruce

