instability of timekeeping

Andriy Gapon avg at FreeBSD.org
Tue Oct 20 10:14:15 UTC 2015


I recently replaced a 2-core Athlon II X2 CPU with a same-family Phenom II X4
CPU and after that I started noticing problems with the timekeeping.  It seems
that from time to time the jitter becomes so high that ntpd goes nuts, stops
synchronizing, or panics.

Here is how the current event timer and time counter configurations look
(slightly trimmed):
$ sysctl kern.timecounter
kern.timecounter.tsc_shift: 1
kern.timecounter.smp_tsc_adjust: 0
kern.timecounter.smp_tsc: 1
kern.timecounter.invariant_tsc: 1
kern.timecounter.fast_gettime: 1
kern.timecounter.tick: 1
kern.timecounter.choice: TSC-low(800) ACPI-fast(900) HPET(950) i8254(0) dummy(-1000000)
kern.timecounter.hardware: TSC-low
kern.timecounter.alloweddeviation: 5
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.TSC-low.quality: 800
kern.timecounter.tc.TSC-low.frequency: 1607357461
kern.timecounter.tc.TSC-low.counter: 2457319922
kern.timecounter.tc.TSC-low.mask: 4294967295
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.i8254.quality: 0
$ sysctl kern.eventtimer
kern.eventtimer.periodic: 0
kern.eventtimer.timer: HPET
kern.eventtimer.idletick: 0
kern.eventtimer.singlemul: 2
kern.eventtimer.choice: HPET(450) HPET1(450) HPET2(450) LAPIC(400) i8254(100) RTC(0)
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.et.HPET2.quality: 450
kern.eventtimer.et.HPET1.quality: 450
kern.eventtimer.et.HPET.quality: 450
kern.eventtimer.et.HPET.frequency: 14318180
kern.eventtimer.et.HPET.flags: 3
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.et.LAPIC.quality: 400
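
As a side check on the numbers above, the TSC-low frequency and the 32-bit
mask can be turned into a raw TSC frequency and a wrap period.  A minimal
Python sketch of that arithmetic, assuming that TSC-low is simply the TSC
shifted right by tsc_shift bits (the variable names are mine, not kernel
identifiers):

```python
# Values copied from the sysctl output above.
tsc_shift = 1
tsc_low_hz = 1607357461
counter_mask = 4294967295
hpet_hz = 14318180

# Assumption: TSC-low counts the TSC shifted right by tsc_shift bits,
# so the underlying TSC runs 2**tsc_shift times faster.
implied_tsc_hz = tsc_low_hz << tsc_shift

# The masked counter wraps after (mask + 1) ticks.
wrap_seconds = (counter_mask + 1) / tsc_low_hz

# Duration of a single HPET tick in nanoseconds.
hpet_tick_ns = 1e9 / hpet_hz

print(f"implied TSC frequency: {implied_tsc_hz / 1e6:.2f} MHz")
print(f"TSC-low wraps every {wrap_seconds:.3f} s")
print(f"HPET tick: {hpet_tick_ns:.2f} ns")
```

The implied TSC frequency comes out at about 3214.71 MHz, which agrees with
the clock rate in the Phenom II boot message quoted at the end of this mail.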

Please note that the TSC-low time counter is chosen administratively, whereas
the event timer configuration is fully automatic.
The previous configuration was produced in the same fashion.
One notable difference is that the previous CPU was 2-core and so two HPET
timers were virtually combined into a single timer with per-CPU capability.  In
other words, two HPET timers were used to drive two cores.
The newer CPU has four cores, so there are not enough HPET timers to drive each
core independently and thus there is no virtual bundling.  Thus, one HPET timer
drives one core and that core forwards the interrupts to other cores via IPIs as
necessary.
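
To spell out the difference, here is a toy Python model of the behaviour as I
understand it.  It is only an illustration of the description above, not the
kernel's actual selection logic; the function name is made up, and the count
of three HPET timers is taken from the HPET, HPET1 and HPET2 entries in the
event timer list, assuming the old setup exposed the same three:

```python
def timer_plan(hpet_timers: int, cores: int) -> str:
    """Toy model of the described behaviour, not the kernel's real logic."""
    if hpet_timers >= cores:
        # Enough timers: bundle one per core into a per-CPU event timer.
        return f"per-CPU: {cores} HPET timers bundled, one per core"
    # Not enough timers: one timer drives one core, which relays via IPIs.
    return f"global: 1 HPET timer on one core, IPIs to the other {cores - 1}"


# Three HPET timers are assumed for both setups (see the lead-in).
print(timer_plan(3, 2))  # previous 2-core Athlon II X2
print(timer_plan(3, 4))  # current 4-core Phenom II X4
```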

But I am far from sure that the stated difference is actually the source of the
instability.  There could be other hardware-related reasons, of course.

I wonder if there is a good way to analyze / debug this situation to see what
exactly is wrong.  For now I am thinking about trying different time counter and
event timer configurations, but I would prefer a more guided "scientific"
approach over a blind trial-and-error one.
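
One way to make such a comparison less blind might be to let ntpd log its own
view of the clock and summarise the log per configuration.  The Python sketch
below assumes that loopstats are enabled in ntp.conf and that each record has
the usual fields (day, second, offset in seconds, frequency offset in ppm,
jitter in seconds, wander in ppm, time constant).  The function name is made
up and the sample records are invented purely to exercise the code; they are
not measurements from this machine.

```python
import statistics


def summarize_loopstats(lines):
    """Return (count, mean |offset|, max jitter), both in seconds."""
    offsets, jitters = [], []
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated records
        offsets.append(abs(float(fields[2])))  # 3rd field: clock offset, s
        jitters.append(float(fields[4]))       # 5th field: RMS jitter, s
    if not offsets:
        return 0, 0.0, 0.0
    return len(offsets), statistics.mean(offsets), max(jitters)


# Invented sample records, for illustration only.
sample = [
    "57315 36855.123 0.000120000 13.778 0.000035000 0.013380 6",
    "57315 36919.123 -0.000480000 13.801 0.000910000 0.021000 6",
    "57315 36983.123 0.002300000 14.102 0.004200000 0.095000 6",
]

count, mean_abs_offset, max_jitter = summarize_loopstats(sample)
print(f"{count} records, mean |offset| {mean_abs_offset * 1e3:.3f} ms, "
      f"max jitter {max_jitter * 1e3:.3f} ms")
```

Running this over the loopstats collected under each setting of
kern.timecounter.hardware and kern.eventtimer.timer would at least give
like-for-like numbers to compare.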

I would appreciate any help, suggestions, hints.

The CPUs:
CPU: AMD Athlon(tm) II X2 250 Processor (3013.79-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x100f62  Family=0x10  Model=0x6  Stepping=2
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x802009<SSE3,MON,CX16,POPCNT>
  AMD Features=0xee500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!>
  AMD Features2=0x37ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT>
  SVM: Features=0xf<NP,LbrVirt,SVML,NRIPS>
Revision=1, ASIDs=64
  TSC: P-state invariant

CPU: AMD Phenom(tm) II X4 955 Processor (3214.71-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x100f43  Family=0x10  Model=0x4  Stepping=3
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x802009<SSE3,MON,CX16,POPCNT>
  AMD Features=0xee500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!>
  AMD Features2=0x37ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT>
  SVM: Features=0xf<NP,LbrVirt,SVML,NRIPS>
Revision=1, ASIDs=64
  TSC: P-state invariant

-- 
Andriy Gapon

