cvs commit: src/sys/conf options.i386 src/sys/i386/i386 tsc.c src/sys/i386/conf NOTES

Bruce Evans bde at zeta.org.au
Mon Apr 7 00:56:32 PDT 2003


On Mon, 7 Apr 2003, Dag-Erling Smorgrav wrote:

> "Poul-Henning Kamp" <phk at phk.freebsd.dk> writes:
> > In message <xzpof3js45j.fsf at flood.ping.uio.no>, Dag-Erling Smorgrav writes:
> > >                                              On most SMP systems, the
> > > PIIX timecounter is automatically selected by virtue of being
> > > discovered last.
> > It is specifically discovered last because it should be used if
> > at all possible.
>
> It's more precise than the TSC (mostly because the TSC calibration
> code sucks),

Um, the TSC calibration code was almost perfect until it was axed.  It
just calibrated relative to the RTC, so it suffered from any inaccuracy
of the RTC, which can be just as large as the inaccuracy of the
nominal i8254 frequency, or have the opposite sign so that the combined
inaccuracy is larger.  The axing message gave lack of precision rather
than lack of accuracy as the reason, but that should be fixable.  The
TSC frequency just varies with temperature in a different way than the
RTC's, so of course it isn't always calibrated to nearly the same value --
that's because it isn't the same.  Also, the calibration was written on
an i486 and isn't quite as careful as it could be about i/o accesses or
taking averages.  I think this accounts for most of the remaining jitter.

Another thing the old code got right was calibration of the i8254 relative
to the TSC.  They obviously use the same hardware clock on at least
all of my active machines (BP6, A7V266-E), since the calibrated ratio
is always the same (as far as ntp running over a long period can tell).
The old calibration code calibrates both relative to another clock, so
the only error in their relative frequencies is from the different times
that it takes to read their counters.  The i8254 counter typically takes
5 usec longer to read, but for some reason the actual error is less than
1 i8254 cycle on all of my active systems.

The current TSC calibration code has much the same 5 usec algorithmic
error, but for some reason the actual error is closer to 7 usec than
to the < 1 usec above.  This is partly because the old calibration code
goes closer to the hardware than the current code -- it used getit()
instead of DELAY().  However, the error is easy to (mostly) compensate
for by calling DELAY() twice.  I currently use the following code.  It
has lots of debugging cruft and is in an old version of clock.c because
I don't use -current.  The 10+ second DELAY() is too long to use in
production:

%%%
Index: clock.c
===================================================================
RCS file: /home/ncvs/src/sys/i386/isa/clock.c,v
retrieving revision 1.191
diff -u -2 -r1.191 clock.c
--- clock.c	4 Dec 2002 13:46:49 -0000	1.191
+++ clock.c	6 Apr 2003 08:30:48 -0000
@@ -793,4 +1061,6 @@
 #endif
 	if (tsc_present && tsc_freq == 0) {
+	u_int64_t tscval[6];
+
 		/*
 		 * Calibration of the i586 clock relative to the mc146818A
@@ -798,29 +1068,71 @@
 		 * to the i8254 clock.
 		 */
-		u_int64_t old_tsc = rdtsc();
-
+		orig_tsc = rdtsc();
 		DELAY(1000000);
-		tsc_freq = rdtsc() - old_tsc;
+		tsc_freq = rdtsc() - orig_tsc;
+
+	/*
+	 * Assume that `DELAY(n); tscval[CONST];' takes a constant time
+	 * longer than n usec.  Do things twice to cancel the constant.  But
+	 * itself, call DELAY(1) to warm up any caches and possibly DELAY()
+	 * itself.
+	 *
+	 * Unfortunately, this is still less precise than the "less precise"
+	 * code that was axed (mainly because it doesn't repeat the internals
+	 * of DELAY() and getit()).  The old code somehow made
+	 * tsc_freq / i8254_freq a constant to within better than 0.5 ppm on
+	 * machines where the perfectly precise value is a constant (because
+	 * the clocks are scaled from the same hardware clock), although there
+	 * is a potential loss of 1 ppm just reading an ISA register and 3
+	 * times that for 3 registers in getit().
+	 */
+cal:
+	DELAY(1);
+	tscval[0] = rdtsc();
+	DELAY(1000);
+	tscval[1] = rdtsc();
+	tscval[2] = rdtsc();
+	DELAY(1001000);
+	tscval[3] = rdtsc();
+	tscval[4] = rdtsc();
+	DELAY(11001000);
+	tscval[5] = rdtsc();
+
+	tsc_freq = tscval[3] - tscval[2] - (tscval[1] - tscval[0]);
+	if (bootverbose)
+		printf("TSC clock: %ju Hz\n", (uintmax_t)tsc_freq);
+	tsc_freq = (tscval[5] - tscval[4] - (tscval[3] - tscval[2])) / 10;
+	if (bootverbose)
+		printf("TSC clock: %ju Hz\n", (uintmax_t)tsc_freq);
+	if (bootverbose)
+		printf("raw: %ju %ju %ju %ju %ju %ju\n", tscval[0], tscval[1],
+		    tscval[2], tscval[3], tscval[4], tscval[5]);
+	if (bootverbose && cncheckc() == -1)
+		goto cal;
+
 #ifdef CLK_USE_TSC_CALIBRATION
 		if (bootverbose)
-			printf("TSC clock: %u Hz (Method B)\n", tsc_freq);
+			printf("TSC clock: %u Hz (method B)\n", tsc_freq);
 #endif
 	}

-#if !defined(SMP)
+#ifdef SMP
 	/*
-	 * We can not use the TSC in SMP mode, until we figure out a
-	 * cheap (impossible), reliable and precise (yeah right!)  way
+	 * We can not use the TSC in SMP mode until we figure out a
+	 * cheap (impossible), reliable and precise (yeah right!) way
 	 * to synchronize the TSCs of all the CPUs.
 	 * Curse Intel for leaving the counter out of the I/O APIC.
 	 */
+	return;
+#endif

 	/*
-	 * We can not use the TSC if we support APM. Precise timekeeping
-	 * on an APM'ed machine is at best a fools pursuit, since
+	 * We can not use the TSC if we support APM.  Precise timekeeping
+	 * on an APM'ed machine is at best a fool's pursuit, since
 	 * any and all of the time spent in various SMM code can't
 	 * be reliably accounted for.  Reading the RTC is your only
-	 * source of reliable time info.  The i8254 looses too of course
-	 * but we need to have some kind of time...
+	 * source of reliable time info.  The i8254 loses too of course
+	 * but we need to have some kind of time.
 	 * We don't know at this point whether APM is going to be used
 	 * or not, nor when it might be activated.  Play it safe.
%%%

The calibration loop in the old code was also axed for the TSC.  A loop
is hacked into the above, and the old code is kept so that I can actually
see if they are working and compare their results for the same boot
(i.e., temperature).  Output looks like this:

%%%
Calibrating clock(s) ... TSC clock: 1532744625 Hz, i8254 clock: 1193121 Hz
Press a key on the console to abort clock calibration
Calibrating clock(s) ... TSC clock: 1532744644 Hz, i8254 clock: 1193121 Hz
Calibrating clock(s) ... TSC clock: 1532744644 Hz, i8254 clock: 1193120 Hz
Calibrating clock(s) ... TSC clock: 1532744644 Hz, i8254 clock: 1193121 Hz
Calibrating clock(s) ... TSC clock: 1532744644 Hz, i8254 clock: 1193120 Hz
Apr  6 22:41:03 besplex last message repeated 3 times
Calibrating clock(s) ... TSC clock: 1532744644 Hz, i8254 clock: 1193121 Hz
Calibrating clock(s) ... TSC clock: 1532744644 Hz, i8254 clock: 1193120 Hz
Calibrating clock(s) ... TSC clock: 1532744644 Hz, i8254 clock: 1193121 Hz
Calibrating clock(s) ... TSC clock: 1532750692 Hz, i8254 clock: 1193125 Hz
Calibrating clock(s) ... TSC clock: 1532744644 Hz, i8254 clock: 1193120 Hz
Apr  6 22:41:03 besplex last message repeated 3 times
Calibrating clock(s) ... TSC clock: 1532750692 Hz, i8254 clock: 1193125 Hz
Calibrating clock(s) ... TSC clock: 1532744644 Hz, i8254 clock: 1193121 Hz
Calibrating clock(s) ... TSC clock: 1532750692 Hz, i8254 clock: 1193125 Hz
Calibrating clock(s) ... TSC clock: 1532744644 Hz, i8254 clock: 1193121 Hz
Calibrating clock(s) ... TSC clock: 1532744644 Hz, i8254 clock: 1193120 Hz
Calibrating clock(s) ... TSC clock: 1532744644 Hz, i8254 clock: 1193121 Hz
Calibrating clock(s) ... TSC clock: 1532744644 Hz, i8254 clock: 1193120 Hz
CLK_USE_I8254_CALIBRATION not specified - using default frequency
Timecounter "i8254"  frequency 1193182 Hz
CLK_USE_TSC_CALIBRATION not specified - using old calibration method
TSC clock: 1532823236 Hz
TSC clock: 1532823868 Hz
raw: 102497236103 102498778864 102498778875 104033144872 104033144883 120895749569
TSC clock: 1532823253 Hz
TSC clock: 1532823868 Hz
raw: 120896993608 120898536184 120898536195 122432902024 122432902035 139295506553
TSC clock: 1532823236 Hz
TSC clock: 1532823868 Hz
raw: 139296717815 139298260408 139298260419 140832626248 140832626259 157695230777
TSC clock: 1532823236 Hz
TSC clock: 1532823868 Hz
raw: 157696400039 157697942632 157697942643 159232308472 159232308483 176094913001
TSC clock: 1532823236 Hz
TSC clock: 1532823868 Hz
raw: 176096077559 176097620152 176097620163 177631985992 177631986003 194494590521
TSC clock: 1532823236 Hz
TSC clock: 1532823868 Hz
raw: 194495761463 194497304056 194497304067 196031669896 196031669907 212894274425
TSC clock: 1532823236 Hz
TSC clock: 1532823868 Hz
raw: 212895446543 212896989136 212896989147 214431354976 214431354987 231293959505
Timecounter "TSC"  frequency 1532823868 Hz
Calibrating clock(s) ... TSC clock: 1532744644 Hz, i8254 clock: 1193121 Hz
%%%

Notes on this output:
- Most of the values are amazingly stable.  The TSC counts are mostly the
  same to within a single CPU clock over a period of 10+ seconds!  This is
  despite running a fair amount of code and doing millions of bus accesses.
- There is only a small difference between cycle counts for delays of 1
  second (adjusted) and 10 seconds (adjusted and divided by 10).  It
  is insignificant but larger than I like, and I can't completely explain
  it.  The raw counts are printed to help debug it.  I first thought
  it was from jitter in the bus accesses, but this is inconsistent
  with the stability of the counts.  Perhaps it is just from different
  overheads for scaling the microsecond counts.
- There is a glitch in the old calibration method that gives a fairly
  consistent error of about 5 i8254 cycles when it happens.  I don't
  understand this.  On the slower BP6, the variance is larger (about 11
  i8254 cycles max) and more evenly distributed.  I don't understand this
  at all.  RTC 1-second interrupts seemed to have a much smaller variance
  than when I last tested them a year or 2 ago, despite interrupt handling
  having an inherently higher variance than a polling loop (this was
  with non-broken fast interrupts as in RELENG_4).
None of the above matters if you use ntp or have special hardware.
Just use the most efficient working timecounter, starting with its
default frequency, and let ntp determine the right frequency.

> but if one is willing to sacrifice a small amount of
> precision for performance, then the TSC is probably better, I think.

It's more than a small amount of precision.  As a practical matter,
ntp can easily compensate for all the normal variance that I have seen
in TSCs and RTCs.  My worst cases are approx. 100 ppm errors for the
initial frequency, 20 ppm for daily temperature-related changes and
< 100 ppm for yearly temperature-related changes.  But the errors from
unsynchronized TSCs and TSC throttling are much larger than that.

I found an interesting example (apparently) involving TSC throttling on
Athlons.  I used to use

	pciconf -w -b pci0:0:0 0x95 0x1a		# Idle hack.

This ORs 0x02 into the register.  It makes Athlons with certain VIA
chipsets run about 20 degrees C cooler in the idle loop.  See

	http://vcool.occludo.net/VC_Theory.html

I stopped using this when I noticed that it screws up the TSC.  The
errors are sometimes thousands of ppm.  As a side benefit, not using
it reduces the maximum system-activity-related temperature changes a
lot (from about 30 degrees C to about 10 degrees C here), so it also
reduces system-activity-related frequency drift.

Bruce

