Atomic operations on i386/amd64

Thu Aug 5 15:01:27 PDT 2004

On Thursday 05 August 2004 01:04 am, Tim Robbins wrote:
> Is there any particular reason why atomic_load_acq_*() and
> atomic_store_rel_*() are implemented with CMPXCHG and XCHG instead of
> MOV on i386/amd64 UP?

Actually, using mov instead of xchg (which is implicitly locked when it has a
memory operand) for store_rel reduced performance in some benchmarks Scott
ran on an SMP machine.  I'm guessing this is due to the higher latency before
a released lock becomes visible to other CPUs.  I'm still waiting for
benchmark results on UP to see if the change should be made under #ifndef SMP
or some such.
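
For concreteness, the two store_rel variants being compared look roughly like
the following.  This is only a sketch assuming GCC-style inline asm: the
function names are made up for illustration and are not the real
atomic_store_rel_*() definitions, and the non-x86 fallback branches exist only
so the example builds elsewhere.

```c
#include <stdint.h>

/* Hypothetical names; NOT the actual FreeBSD atomic_store_rel_*() code. */

/* Variant 1: plain MOV.  The "memory" clobber keeps the compiler from
 * moving earlier memory accesses past the store. */
static inline void
store_rel_mov(volatile uint32_t *p, uint32_t v)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("movl %1, %0" : "=m" (*p) : "r" (v) : "memory");
#else
	__atomic_store_n(p, v, __ATOMIC_RELEASE);	/* fallback, assumption */
#endif
}

/* Variant 2: XCHG with a memory operand, which asserts LOCK implicitly. */
static inline void
store_rel_xchg(volatile uint32_t *p, uint32_t v)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("xchgl %1, %0" : "+m" (*p), "+r" (v) : : "memory");
#else
	(void)__atomic_exchange_n(p, v, __ATOMIC_SEQ_CST); /* fallback */
#endif
}
```

Both leave the same value in memory; the difference under discussion is only
in how quickly the store becomes visible to other CPUs and what it costs.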

> Also, could we use MFENCE/LFENCE/SFENCE in combination with MOV on
> SMP systems instead of LOCK CMPXCHG / (implied LOCK) XCHG?

MFENCE and LFENCE came in with SSE2, so they only exist on the P4.  SFENCE
came in with SSE, so it only exists on the PIII and later.  Using any fence
on i386 would therefore lose the ability to run on PIIs and earlier, and
using anything more than SFENCE would lose PIIIs as well.  amd64 could
probably be changed, though, since SSE2 is part of the base amd64 instruction
set and so every amd64 CPU should have all three fences.  That might be
something to benchmark on both UP and SMP to see what kind of difference it
makes.
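
A fence-based version for amd64 might look something like the sketch below.
Again the names are hypothetical, the choice of LFENCE after the load and
MFENCE before the store is just one conservative possibility and not a tested
or benchmarked proposal, and the SSE2 check is how I would assume an i386
build could be kept off CPUs without the fences.

```c
#include <stdint.h>

/* Hypothetical names; one possible MOV + fence arrangement, untested. */
#if defined(__x86_64__) || (defined(__i386__) && defined(__SSE2__))
#define	HAVE_SSE2_FENCES 1	/* LFENCE/MFENCE require SSE2 */
#endif

/* Load, then LFENCE so later loads are not started before this one. */
static inline uint32_t
load_acq_fence(volatile uint32_t *p)
{
	uint32_t v;
#ifdef HAVE_SSE2_FENCES
	__asm__ __volatile__("movl %1, %0\n\tlfence"
	    : "=r" (v) : "m" (*p) : "memory");
#else
	v = __atomic_load_n(p, __ATOMIC_ACQUIRE);	/* fallback, assumption */
#endif
	return (v);
}

/* MFENCE so earlier loads and stores complete, then a plain store. */
static inline void
store_rel_fence(volatile uint32_t *p, uint32_t v)
{
#ifdef HAVE_SSE2_FENCES
	__asm__ __volatile__("mfence\n\tmovl %1, %0"
	    : "=m" (*p) : "r" (v) : "memory");
#else
	__atomic_store_n(p, v, __ATOMIC_RELEASE);	/* fallback, assumption */
#endif
}
```

Whether this beats the locked instructions is exactly what the benchmarks
would have to show.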

-- 
John Baldwin <jhb at FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org