[PATCH] Statclock aliasing by LAPIC

Tue Jan 19 19:19:54 UTC 2010

On Tue, 19 Jan 2010, Attilio Rao wrote:

> 2010/1/19 Scott Long <scottl at samsco.org>:
>> On Jan 19, 2010, at 10:27 AM, Attilio Rao wrote:
>>>
>>> 2010/1/19 John Baldwin <jhb at freebsd.org>:
>>>> My feeling, btw, is that the real solution is to not use a sampling clock
>>>> for per-process stats, but to just use the cycle counter and keep separate
>>>> user, system, and interrupt cycle counts (like the rux_runtime we have now).
>>>> This makes calcru() trivial and eliminates many of the weird "going
>>>> backwards", etc. problems.  The only issue with this approach is that not
>>>> all platforms have a cheap cycle counter (many embedded platforms lack one
>>>> I think), so you would almost need to support both modes of operation and
>>>> maybe have an #define in <machine/param.h> to choose between the two modes.
>>>
>>> Generally that would be a good idea, but the problem is not only for
>>> the architectures not supporting it, but also for architectures that
>>> do (eg. TSC de-synchronization in some SMP environment).
>>>
>>
>> For process stats, TSC desync isn't a big problem.  As a process migrates
>> from one CPU to the other, its stats from the old cpu will be recorded, then
>> stats will be started on the new cpu.  The only problem here is with
>> normalizing the different TSC's to a common reference.  Maybe that can be
>> done when computing cp_times?  This is definitely a case where 'perfect' is
>> the enemy of 'a hell of a lot better than we have now'.
>

Only the frequencies would need normalization, since the TSCs are per-CPU
and they hopefully don't get reset by suspend etc.  Separate frequencies
for separate CPUs are not supported now.
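
As a rough illustration of what normalizing only the frequencies could
look like, here is a minimal userland sketch in C.  It assumes that a
cycle count is kept per CPU for each thread and that each CPU's TSC
frequency is known; neither is claimed to match the current kernel.
The names struct cpu_cycles, cycles_to_usec() and total_runtime_usec()
are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical per-CPU record: cycles a thread ran on one CPU, and
 * that CPU's TSC frequency.  Not an existing kernel structure.
 */
struct cpu_cycles {
	uint64_t cycles;	/* cycles accumulated on this CPU */
	uint64_t freq_hz;	/* this CPU's TSC frequency in Hz */
};

/*
 * Convert a cycle count to microseconds.  Whole seconds and the
 * remainder are scaled separately so that the multiplication by
 * 1000000 is applied only to a value smaller than freq_hz, which
 * cannot overflow 64 bits for any realistic frequency.
 */
static uint64_t
cycles_to_usec(uint64_t cycles, uint64_t freq_hz)
{
	uint64_t sec, rem;

	assert(freq_hz != 0);
	sec = cycles / freq_hz;
	rem = cycles % freq_hz;
	return (sec * 1000000 + rem * 1000000 / freq_hz);
}

/*
 * Sum the runtime over all CPUs.  Each count is converted with its
 * own CPU's frequency, so the absolute TSC values never need to agree
 * between CPUs; only the frequencies matter.
 */
static uint64_t
total_runtime_usec(const struct cpu_cycles *v, size_t ncpu)
{
	uint64_t usec = 0;
	size_t i;

	for (i = 0; i < ncpu; i++)
		usec += cycles_to_usec(v[i].cycles, v[i].freq_hz);
	return (usec);
}
```

In this sketch the per-CPU counts are only ever deltas taken on a single
CPU, which is why an offset between the TSCs of different CPUs drops
out.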

> I wouldn't like to be mistaken, but IIRC in some benchmarks kris@ did
> in the past years we were seeing TSC timers literally going backwards
> after the de-synchronization (even on absolute measurement).

Do you really mean individual TSCs going backwards?  P-state-invariance
(?) should prevent the desync.  If the TSCs actually desync, then TSC
timecounters are sure to break, with timecounters going backwards being
a typical result (certain calculations overflow if time deltas are
unexpectedly large).  Timecounters used to be used for the equivalent
of rux_runtime.  There were/are no checks for timecounters themselves
going backwards, but sanity checks in the use of rux_runtime detected
this.  Now TSCs (if available) are normally used for rux_runtime.
Recalibration of the TSC's assumed-common frequency is buggy and can
easily cause bizarre user times when the frequency is changed.
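
To make the overflow concrete: with unsigned counters, a reading that
is lower than the previous one produces an enormous delta instead of a
negative one.  The following is a minimal sketch in C of the kind of
sanity check meant above; it is not the kernel's actual code, and
struct runtime_acc and runtime_update() are hypothetical names used
only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accumulator for one thread's runtime. */
struct runtime_acc {
	uint64_t last;		/* counter value at the previous sample */
	uint64_t total;		/* accumulated runtime in counter units */
	uint64_t backwards;	/* samples rejected for going backwards */
};

/*
 * Fold a new counter reading into the accumulator.  A naive
 * "total += now - last" would add nearly 2^64 when now < last,
 * because the unsigned subtraction wraps.  Here such a sample is
 * counted and skipped, and the baseline is reset to the new reading.
 * Returns false if the counter was seen going backwards.
 */
static bool
runtime_update(struct runtime_acc *acc, uint64_t now)
{
	bool ok;

	assert(acc != NULL);
	ok = now >= acc->last;
	if (ok)
		acc->total += now - acc->last;
	else
		acc->backwards++;
	acc->last = now;
	return (ok);
}
```

Skipping the bad sample is only one possible policy; it hides the lost
interval rather than recovering it, so the total can still be too
small after a backwards step.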

Apart from that, rux_runtime is correct.  It is good enough for
scheduling even when incorrect.

Bruce