svn commit: r238755 - head/sys/x86/x86

Konstantin Belousov kostikbel at gmail.com
Wed Jul 25 10:21:37 UTC 2012


On Wed, Jul 25, 2012 at 10:20:02AM +0300, Andriy Gapon wrote:
> on 25/07/2012 01:10 Jim Harris said the following:
> > Author: jimharris
> > Date: Tue Jul 24 22:10:11 2012
> > New Revision: 238755
> > URL: http://svn.freebsd.org/changeset/base/238755
> > 
> > Log:
> >   Add rmb() to tsc_read_##x to enforce serialization of rdtsc captures.
> >   
> >   Intel Architecture Manual specifies that rdtsc instruction is not serialized,
> >   so without this change, TSC synchronization test would periodically fail,
> >   resulting in use of HPET timecounter instead of TSC-low.  This caused
> >   severe performance degradation (40-50%) when running high IO/s workloads due to
> >   HPET MMIO reads and GEOM stat collection.
> >   
> >   Tests on Xeon E5-2600 (Sandy Bridge) 8C systems were seeing TSC synchronization
> >   fail approximately 20% of the time.
> 
> Should rather the synchronization test be fixed if it's the culprit?
Synchronization test for what ?

> Or is this change universally good for the real uses of TSC?

What I understood from the Intel SDM, and also from additional experiments
which Jim kindly made despite me being annoying as usual, is that 'read
memory barrier' AKA LFENCE there is used for its secondary implementation
effects, not for load/load barrier as you might assume.

According to SDM, LFENCE fully drains execution pipeline (but comparing
with MFENCE, does not drain write buffers). The result is that RDTSC is
not started before previous instructions are finished.

For tsc test, this means that after the change RDTSC executions are not
reordered on the single core among themself. As I understand, CPU has
no dependency noted between two reads of tsc by RDTSC, which allows
later read to give lower value of counter. This is fixed by Intel by
introduction of RDTSCP instruction, which is defined to be serialization
point, and use of which (instead of LFENCE; RDTSC sequence) also fixes
test, as confirmed by Jim.

In fact, I now think that we should also apply the following patch.
Otherwise, consequtive calls to e.g. binuptime(9) could return decreased
time stamps. Note that libc __vdso_gettc.c already has LFENCE nearby the
tsc reads, which was done not for this reason, but apparently needed for
the reason too.

diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c
index 085c339..229b351 100644
--- a/sys/x86/x86/tsc.c
+++ b/sys/x86/x86/tsc.c
@@ -594,6 +594,7 @@ static u_int
 tsc_get_timecount(struct timecounter *tc __unused)
 {
 
+	rmb();
 	return (rdtsc32());
 }
 
@@ -602,8 +603,9 @@ tsc_get_timecount_low(struct timecounter *tc)
 {
 	uint32_t rv;
 
+	rmb();
 	__asm __volatile("rdtsc; shrd %%cl, %%edx, %0"
-	: "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
+	    : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
 	return (rv);
 }
 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 196 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/svn-src-all/attachments/20120725/04888abf/attachment.pgp


More information about the svn-src-all mailing list