svn commit: r221703 - in head/sys: amd64/include i386/include x86/isa x86/x86

Thu May 12 11:23:07 UTC 2011

on 09/05/2011 20:34 Jung-uk Kim said the following:
> Author: jkim
> Date: Mon May  9 17:34:00 2011
> New Revision: 221703
> URL: http://svn.freebsd.org/changeset/base/221703
[snip]

I would to note [again] that I don't like code style of this change.

> Modified: head/sys/x86/x86/tsc.c
> ==============================================================================
> --- head/sys/x86/x86/tsc.c	Mon May  9 17:30:25 2011	(r221702)
> +++ head/sys/x86/x86/tsc.c	Mon May  9 17:34:00 2011	(r221703)
> @@ -326,7 +326,73 @@ init_TSC(void)
>  	    tsc_levels_changed, NULL, EVENTHANDLER_PRI_ANY);
>  }
>  
> -void
> +#ifdef SMP
> +
> +#define	TSC_READ(x)			\
> +static void				\
> +tsc_read_##x(void *arg)			\
> +{					\
> +	uint32_t *tsc = arg;		\
> +	u_int cpu = PCPU_GET(cpuid);	\
> +					\
> +	tsc[cpu * 3 + x] = rdtsc32();	\
> +}
> +TSC_READ(0)
> +TSC_READ(1)
> +TSC_READ(2)
> +#undef TSC_READ
> +
> +#define	N	1000

I don't like macro overuse in the above.
Especially "N".

> +static void
> +comp_smp_tsc(void *arg)
> +{
> +	uint32_t *tsc;
> +	int32_t d1, d2;
> +	u_int cpu = PCPU_GET(cpuid);
> +	u_int i, j, size;
> +
> +	size = (mp_maxid + 1) * 3;
> +	for (i = 0, tsc = arg; i < N; i++, tsc += size)
> +		CPU_FOREACH(j) {
> +			if (j == cpu)
> +				continue;
> +			d1 = tsc[cpu * 3 + 1] - tsc[j * 3];
> +			d2 = tsc[cpu * 3 + 2] - tsc[j * 3 + 1];
> +			if (d1 <= 0 || d2 <= 0) {
> +				smp_tsc = 0;
> +				return;
> +			}
> +		}
> +}
> +
> +static int
> +test_smp_tsc(void)
> +{
> +	uint32_t *data, *tsc;
> +	u_int i, size;
> +
> +	if (!smp_tsc && !tsc_is_invariant)
> +		return (-100);
> +	size = (mp_maxid + 1) * 3;
> +	data = malloc(sizeof(*data) * size * N, M_TEMP, M_WAITOK);
> +	for (i = 0, tsc = data; i < N; i++, tsc += size)
> +		smp_rendezvous(tsc_read_0, tsc_read_1, tsc_read_2, tsc);

I don't like that what is logically a two dimensional array 3 x (mp_maxid + 1) is
represented as a one-dimensional array with all ensuing multiplications by three
and other array index manipulations.

> +	smp_tsc = 1;	/* XXX */
> +	smp_rendezvous(smp_no_rendevous_barrier, comp_smp_tsc,
> +	    smp_no_rendevous_barrier, data);
> +	free(data, M_TEMP);
> +	if (bootverbose)
> +		printf("SMP: %sed TSC synchronization test\n",
> +		    smp_tsc ? "pass" : "fail");
> +	return (smp_tsc ? 800 : -100);

I still think something higher should be returned here for the smp_tsc == true
case.  It doesn't make sense to go through the shenanigans to underrate TSC in the
end.

On a more general note, smp_rendezvous() might not be the best mechanism here.
In my opinion/understanding, smp_rendezvous() provides only the following guarantees:
- if a setup action is specified, then every CPU executes the setup action before
any CPU executes the main action
- if a teardown action is specified, then every CPU executes the main action
before any CPU executes the teardown action

There are no timing guarantees, only the sequence guarantees.
Any timing observations that we may have now are a product of the implementation
and can change if the implementation change.  So the newly introduced check may
miss systemic differences in TSC values between CPUs if they are small enough.
But I am not really sure if such a small differences would really matter.

Worse case if there is some difference in TSC frequency between CPUs (e.g. in
different sockets), after a powerup or a reset difference between TSC values may
be very small, but could grow more and more with uptime.  Not sure if this is a
realistic enough scenario, though.

-- 
Andriy Gapon