svn commit: r221703 - in head/sys: amd64/include i386/include x86/isa x86/x86

Bruce Evans brde at optusnet.com.au
Fri May 13 19:14:23 UTC 2011


On Fri, 13 May 2011, Jung-uk Kim wrote:

> On Friday 13 May 2011 08:47 am, Andriy Gapon wrote:
>> on 12/05/2011 19:39 Jung-uk Kim said the following:
>>> Actually, I am kinda reluctant to enable smp_tsc by default on
>>> recent CPUs.  Although they made all TSCs in sync, it is very
>>> very tricky to make it work in reality, e.g.,
>>>
>>> https://patchwork.kernel.org/patch/691712/
>>
>> I am not sure what is their concern there.
>> TSC is good to be used as timecounter.
>
> *Iff* they are all in sync. and atomically increasing...

Not even that.  rdtsc is non-serializing, so causality between TSC reads
may be violated when rdtsc is reordered differently on different CPUs.
Apparently this happens in practice.  I think it would not happen for
"rdtsc; rdtsc" in a single thread, since the context switch to execute
the rdtsc's on different CPUs would take a long time and would probably
execute some serializing instructions.  It might happen for clock_gettime()
in separate threads where the threads somehow know and depend on the
order of the calls.  Shared variables seem to be needed for knowing
this, and I don't know how the variables could be accessed atomically
enough without serializing the rdtsc's.  Apart from that, the code might
be:

         thread 1                thread 2
         --------                --------
         start1 = gen++;         start2 = gen++;
         clock_gettime(...);     clock_gettime(...);
         end1 = gen++;           end2 = gen++;

We can hope that the first clock_gettime() executed entirely before
the second one if end1 < start2.  But without any serializing instructions,
the rdtsc's aren't guaranteed to execute between the stores to the variables,
so knowing the order of the stores tells us nothing about the order of the
rdtsc's.
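
For illustration only, the two-thread pattern above could be written as a
probe plus a check, roughly as follows.  This is a sketch, not code from
FreeBSD or Linux: the names probe, take_probe() and causality_violated()
are made up here, and a C11 atomic increment stands in for "gen++".
Whether that increment constrains the rdtsc inside clock_gettime() is
exactly the open question, so the sketch only detects a violation; it
does not prevent one.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* One probe: generation numbers taken before and after a clock read. */
struct probe {
    unsigned long start;
    unsigned long end;
    struct timespec ts;
};

/* Shared generation counter; the atomic increment stands in for "gen++". */
atomic_ulong gen;

/* Record start, read the clock, record end (one column of the table above). */
void take_probe(struct probe *p)
{
    p->start = atomic_fetch_add(&gen, 1);
    int rc = clock_gettime(CLOCK_MONOTONIC, &p->ts);
    assert(rc == 0);
    (void)rc;
    p->end = atomic_fetch_add(&gen, 1);
}

/* True if timestamp a is strictly later than timestamp b. */
bool ts_after(const struct timespec *a, const struct timespec *b)
{
    if (a->tv_sec != b->tv_sec)
        return a->tv_sec > b->tv_sec;
    return a->tv_nsec > b->tv_nsec;
}

/*
 * The generation numbers say "first" finished before "second" began,
 * yet the clock says "first" read a later time: causality was violated.
 */
bool causality_violated(const struct probe *first, const struct probe *second)
{
    return first->end < second->start && ts_after(&first->ts, &second->ts);
}
```

Each thread would call take_probe() and the results would be compared
afterwards with causality_violated(&p1, &p2) and the reverse.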

>> If they use TSC for performance measurements, then of course they
>> have to use some barriers - this is well known and documented.

Also for timecounters.

> If my understanding is correct, Linux has to make sure the new timer
> value read from a CPU must be written/read to/from memory in order
> and all other CPUs must be able see the updated value as their
> "vsyscall" and/or "vDSO" version of gettimeofday(2) and friends rely
> on it.  Also, the last value read from a CPU is kept in memory and
> compared with a new value (possibly read from another CPU) to make
> sure it is incremental.  I'd call it a "TSC-safe" timecounter. ;-)
> Some price to pay when you do timekeeping in user space to avoid
> syscalls...

Hmm, timecounter code intentionally doesn't store the last value read
to a shared (kernel) variable due to the cost of doing so, although
this causes bugs like times read by the "get*()" interfaces being
incoherent with times read by the non-"get*()" interfaces.
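
The "keep the last value in memory and compare" scheme from the quoted
text can be sketched as below.  This is an assumption about its general
shape, not the Linux or FreeBSD implementation; monotonic_clamp() and its
parameters are hypothetical names.  The compare-and-swap on a shared
variable is the cost that the timecounter code avoids.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return a value that never goes backwards across callers: publish "now"
 * if it is newer than the last published value, otherwise return the
 * last published value instead.
 */
uint64_t monotonic_clamp(_Atomic uint64_t *last, uint64_t now)
{
    assert(last != NULL);
    uint64_t prev = atomic_load(last);

    while (prev < now) {
        /* On failure, prev is reloaded with the current value of *last. */
        if (atomic_compare_exchange_weak(last, &prev, now))
            return now;
    }
    /* Another reader already published an equal or later value. */
    return prev;
}
```

A reader on any CPU would pass its raw counter value as "now" and use
the return value, so a slightly stale read on one CPU is reported as the
later value already seen elsewhere.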

Perhaps the problem is only visible with the userland implementation,
since clock_gettime() is too slow to expose it via code like the above
if it is a syscall (the rdtsc will be somewhere in the middle and there
is plenty of time for it to complete before the stores in userland).
The Linux discussion says that a change optimizes clock_gettime() from
22 ns to 17 ns on Sandybridge.  20 ns is about the time for a single
rdtsc.  On AthlonXP, the best I've seen for the FreeBSD syscall is
about 250 ns (550 cycles), despite rdtsc only taking 12 cycles on
AthlonXP (rdtsc takes more like 40-80 cycles on newer CPUs, for the
hardware part of its synchronization :-().  Timecounter calls using
the TSC timecounter in the FreeBSD kernel take only about 50 cycles
(when rdtsc takes only 12 cycles), but syscalls add a lot.
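
To put the cycle and nanosecond figures on one scale: 250 ns for 550
cycles implies a clock of about 2.2 GHz (inferred from those two numbers,
not stated above).  A small sketch of the conversion, plus a rough way to
measure the per-call cost of clock_gettime() on a given machine, might
look like this.  The function names are hypothetical and the measured
number includes loop overhead.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Convert a cycle count to nanoseconds for a clock running at hz. */
double cycles_to_ns(double cycles, double hz)
{
    assert(hz > 0.0);
    return cycles * 1e9 / hz;
}

/* Average cost in nanoseconds of one clock_gettime() call over iters calls. */
double clock_gettime_cost_ns(long iters)
{
    struct timespec t0, t1, scratch;

    assert(iters > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++)
        clock_gettime(CLOCK_MONOTONIC, &scratch);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double elapsed = (double)(t1.tv_sec - t0.tv_sec) * 1e9
                   + (double)(t1.tv_nsec - t0.tv_nsec);
    return elapsed / (double)iters;
}
```

At the inferred 2.2 GHz, 12 cycles is about 5.5 ns and 50 cycles is about
22.7 ns, which is the same order as the 17-22 ns quoted for Linux.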

>> BTW, newer CPUs provide RDTSCP instruction which could be more
>> convenient there.
>
> AFAIK, some people wanted to do that but Linus thought RDTSCP is too
> expensive as it is a serialized instruction.

The whole point of the Linux discussion is to reduce the synchronization
that they already have (a couple of fence instructions).

Bruce

