proposed smp_rendezvous change

Max Laier max at love2party.net
Fri May 13 14:54:51 UTC 2011


On Friday 13 May 2011 09:43:25 Andriy Gapon wrote:
> This is a change in vein of what I've been doing in the xcpu branch and
> it's supposed to fix the issue by the recent commit that (probably
> unintentionally) stress-tests smp_rendezvous in TSC code.
> 
> Non-essential changes:
> - ditch initial, and in my opinion useless, pre-setup rendezvous in
> smp_rendezvous_action()
> - shift smp_rv_waiters indexes by one because of the above
> 
> Essential changes (the fix):
> - re-use freed smp_rv_waiters[2] to indicate that a slave/target is really
> fully done with rendezvous (i.e. it's not going to access any members of
> smp_rv_* pseudo-structure)
> - spin on smp_rv_waiters[2] upon _entry_ to smp_rendezvous_cpus() to not
> re-use the smp_rv_* pseudo-structure too early
> 
> Index: sys/kern/subr_smp.c
> ===================================================================
> --- sys/kern/subr_smp.c	(revision 221835)
> +++ sys/kern/subr_smp.c	(working copy)
> @@ -316,19 +316,14 @@
>  	void (*local_action_func)(void*)   = smp_rv_action_func;
>  	void (*local_teardown_func)(void*) = smp_rv_teardown_func;
> 
> -	/* Ensure we have up-to-date values. */
> -	atomic_add_acq_int(&smp_rv_waiters[0], 1);
> -	while (smp_rv_waiters[0] < smp_rv_ncpus)
> -		cpu_spinwait();
> -
>  	/* setup function */
>  	if (local_setup_func != smp_no_rendevous_barrier) {
>  		if (smp_rv_setup_func != NULL)
>  			smp_rv_setup_func(smp_rv_func_arg);
> 
>  		/* spin on entry rendezvous */
> -		atomic_add_int(&smp_rv_waiters[1], 1);
> -		while (smp_rv_waiters[1] < smp_rv_ncpus)
> +		atomic_add_int(&smp_rv_waiters[0], 1);
> +		while (smp_rv_waiters[0] < smp_rv_ncpus)
>                  	cpu_spinwait();
>  	}
> 
> @@ -337,12 +332,16 @@
>  		local_action_func(local_func_arg);
> 
>  	/* spin on exit rendezvous */
> -	atomic_add_int(&smp_rv_waiters[2], 1);
> -	if (local_teardown_func == smp_no_rendevous_barrier)
> +	atomic_add_int(&smp_rv_waiters[1], 1);
> +	if (local_teardown_func == smp_no_rendevous_barrier) {
> +		atomic_add_int(&smp_rv_waiters[2], 1);
>                  return;
> -	while (smp_rv_waiters[2] < smp_rv_ncpus)
> +	}
> +	while (smp_rv_waiters[1] < smp_rv_ncpus)
>  		cpu_spinwait();
> 
> +	atomic_add_int(&smp_rv_waiters[2], 1);
> +
>  	/* teardown function */
>  	if (local_teardown_func != NULL)
>  		local_teardown_func(local_func_arg);
> @@ -377,6 +376,10 @@
>  	/* obtain rendezvous lock */
>  	mtx_lock_spin(&smp_ipi_mtx);
> 
> +	/* Wait for any previous unwaited rendezvous to finish. */
> +	while (atomic_load_acq_int(&smp_rv_waiters[2]) < ncpus)

this ncpus isn't the one you are looking for.  I have a patch of my own that 
is attached.  This might be overdoing it to some extend, but it has been 
running very well on our system for quite some time, which now uses rmlock 
heavily.

The rmlock diff is unrelated, but since I have this diff around for some time 
now ...

> +		cpu_spinwait();
> +
>  	/* set static function pointers */
>  	smp_rv_ncpus = ncpus;
>  	smp_rv_setup_func = setup_func;
> @@ -395,7 +398,7 @@
>  		smp_rendezvous_action();
> 
>  	if (teardown_func == smp_no_rendevous_barrier)
> -		while (atomic_load_acq_int(&smp_rv_waiters[2]) < ncpus)
> +		while (atomic_load_acq_int(&smp_rv_waiters[1]) < ncpus)
>  			cpu_spinwait();
> 
>  	/* release lock */
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bug71970.diff
Type: text/x-patch
Size: 7617 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-current/attachments/20110513/2617f452/bug71970.bin


More information about the freebsd-current mailing list