[PATCH] Statclock aliasing by LAPIC

Tue Jan 19 17:41:25 UTC 2010

2010/1/19 Scott Long <scottl at samsco.org>:
> On Jan 19, 2010, at 10:27 AM, Attilio Rao wrote:
>>
>> 2010/1/19 John Baldwin <jhb at freebsd.org>:
>>>
>>> On Saturday 16 January 2010 7:09:38 am Attilio Rao wrote:
>>>>
>>>> 2010/1/16 Bruce Evans <brde at optusnet.com.au>:
>>>>>
>>>>> On Fri, 15 Jan 2010, Attilio Rao wrote:
>>>>>
>>>>>> I still see clock_lock in place (and no particular critical section
>>>>>> code in those paths), or did you mean that clock_lock alone still
>>>>>> doesn't provide enough protection?
>>>>>> BTW, you were right about the lapic_timer_hz (I forgot to revert to
>>>>>> hz). There is an updated patch:
>>>>>>
>>>>>> http://www.freebsd.org/~attilio/Sandvine/STABLE_8/statclock_aliasing/statclock_aliasing4.diff
>>>>>
>>>>> It seems to have the same fundamental bugs as the previous version.
>>>>> The atrtc interrupt is too slow to use for anything, so it should never
>>>>> be used if there is something better like the lapic timer available
>>>>> (even the i8254 is better), and using it here doesn't even fix the
>>>>> problem (malicious applications can very easily hide from statclock
>>>>> by default since the default hz is much larger than the default stathz,
>>>>> and malicious applications can not so easily hide from statclock
>>>>> irrespective of the misconfiguration of hz, since statclock is not
>>>>> random).  See my previous reply and
>>>>> ftp://ftp.ee.lbl.gov/papers/statclk-usenix93.ps.Z for more details.
>>>>
>>>> Well, the primary thing I wanted to fix is not the hiding of
>>>> malicious programs but the clock aliasing created when handling all
>>>> the clocks by the same source.
>>>> About the slowness -- I'm fine with whatever additional source to
>>>> LAPIC we would eventually use thus would you feel better if i8254 is
>>>> used replacing atrtc?
>>>> Also note that atrtc is the default if LAPIC cannot be used. I don't
>>>> understand why another source, even simpler (eg. i8254) would have
>>>> been used in that specific case by the 'old' code.
>>>>
>>>> What I mean, then, is: I see your points, and I'm not arguing against
>>>> them at all, but the old code has other problems that get fixed by this
>>>> patch (having different sources makes the whole system more flexible),
>>>> while the new issues it introduces are secondary.  Still, I'm fine with
>>>> whatever second source is picked for statclock and profclock if you
>>>> really see a concern wrt atrtc slowness.
>>>
>>> You can't use the i8254 reliably with APIC enabled.  Some motherboards
>>> don't actually hook up IRQ 0 to pin 2.  We used to support this by
>>> enabling IRQ 0 in the atpic and enabling the ExtINT pin to use both sets
>>> of PICs in tandem.  However, this was very gross and had its own set of
>>> issues, so we removed the support for "mixed mode" a while ago.  Also,
>>> the ACPI specification specifically forbids an OS from using "mixed
>>> mode".
>>>
>>> My feeling, btw, is that the real solution is to not use a sampling clock
>>> for per-process stats, but to just use the cycle counter and keep
>>> separate user, system, and interrupt cycle counts (like the rux_runtime
>>> we have now).  This makes calcru() trivial and eliminates many of the
>>> weird "going backwards", etc. problems.  The only issue with this
>>> approach is that not all platforms have a cheap cycle counter (many
>>> embedded platforms lack one I think), so you would almost need to support
>>> both modes of operation and maybe have a #define in <machine/param.h> to
>>> choose between the two modes.
>>
>> Generally that would be a good idea, but the problem is not only the
>> architectures that don't support it, but also architectures that do
>> (eg. TSC de-synchronization in some SMP environments).
>>
>
> For process stats, TSC desync isn't a big problem.  As a process migrates
> from one CPU to the other, its stats from the old cpu will be recorded, then
> stats will be started on the new cpu.  The only problem here is with
> normalizing the different TSC's to a common reference.  Maybe that can be
> done when computing cp_times?  This is definitely a case where 'perfect' is
> the enemy of 'a hell of a lot better than we have now'.

I may be mistaken, but IIRC in some benchmarks kris@ ran in past years
we were seeing TSC timers literally going backwards after
de-synchronization (even on absolute measurements).
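
To make the two positions above concrete, here is a minimal sketch in
Python (not FreeBSD code) of per-mode cycle accounting in which every
interval is closed on the CPU that opened it, as in the migration scheme
quoted above. The names read_counter and CycleStats, the fixed per-CPU
offsets, and the policy of dropping negative intervals are all
assumptions made for illustration only.

```python
# Hypothetical model: each CPU's cycle counter is true time plus a fixed,
# unknown offset (the "de-synchronization").  Not FreeBSD code.
CPU_OFFSET = {0: 0, 1: -5000}


def read_counter(cpu, true_time):
    """Simulated per-CPU cycle counter read."""
    return true_time + CPU_OFFSET[cpu]


class CycleStats:
    """Per-thread cycle totals, split by execution mode."""

    def __init__(self):
        self.cycles = {"user": 0, "system": 0, "interrupt": 0}
        self.dropped = 0  # intervals discarded because the delta was negative

    def charge(self, mode, start, end):
        delta = end - start
        if delta < 0:
            # Assumed policy: never let a total go backwards; drop the sample.
            self.dropped += 1
            return
        self.cycles[mode] += delta


# The thread runs in user mode on CPU 0 for true time 1000..2000, migrates,
# then runs in system mode on CPU 1 for true time 2000..2600.
per_cpu = CycleStats()
per_cpu.charge("user", read_counter(0, 1000), read_counter(0, 2000))
per_cpu.charge("system", read_counter(1, 2000), read_counter(1, 2600))

# Naive variant: one interval opened on CPU 0 and closed on CPU 1.
naive = CycleStats()
naive.charge("user", read_counter(0, 1000), read_counter(1, 2600))

print(per_cpu.cycles, per_cpu.dropped)
print(naive.cycles, naive.dropped)
```

In this model the per-CPU intervals are unaffected by the offset, while
the interval that spans the migration comes out negative and is dropped.
The sketch does not cover a counter that steps backwards on a single CPU,
or CPUs whose counters tick at different rates; those cases would still
need some normalization to a common reference.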

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein