Atomic operations on i386/amd64

Thu Aug 5 15:17:42 PDT 2004

John Baldwin wrote:
> On Thursday 05 August 2004 01:04 am, Tim Robbins wrote:
> 
>>Is there any particular reason why atomic_load_acq_*() and
>>atomic_store_rel_*() are implemented with CMPXCHG and XCHG instead of
>>MOV on i386/amd64 UP?
> 
> 
> Actually, using mov instead of lock xchg for store_rel reduced performance in 
> some benchmarks Scott ran on an SMP machine, I'm guessing due to the higher 
> latency of locks becoming available to other CPUs.  I'm still waiting for 
> benchmark results on UP to see if the change should be made under #ifndef SMP 
> or some such.
> 
> 
>>Also, could we use MFENCE/LFENCE/SFENCE in combination with MOV on
>>SMP systems instead of LOCK CMPXCHG / (implied LOCK) XCHG?
> 
> 
> MFENCE and LFENCE only exist on the P4.  SFENCE only exists on P3+, so using 
> fences at all would cost you the ability to run on PIIs and earlier, and 
> using anything beyond SFENCE would cost you PIIIs as well.  Note that amd64 
> could probably be changed, though, since those CPUs might all have fences, 
> in which case that might be something to benchmark on both UP and SMP to 
> see what kind of difference it makes.
> 

We always have the ability to define PENTIUM2_CPU, PENTIUM3_CPU, and
PENTIUM4_CPU CPU types in the kernel and then #ifdef the code
appropriately (and ship with the lowest common denominator, like we do
for I386/I486/I586/I686).
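
As a rough sketch of what that #ifdef could look like for a 32-bit
store_rel: this is not the actual atomic.h code. The names
sketch_store_rel_32 and sketch_demo are made up for illustration,
PENTIUM4_CPU is the proposed option and does not exist today, and the
non-x86 fallback through the GCC/Clang __atomic builtin is only there so
the example builds on other machines.

```c
#include <stdint.h>

/*
 * Hypothetical build-time knob: defining PENTIUM4_CPU means "assume
 * MFENCE exists".  Leaving it undefined gives the lowest-common-
 * denominator build, which works on every x86 CPU.
 */
static inline void
sketch_store_rel_32(volatile uint32_t *p, uint32_t v)
{
#if defined(__i386__) || defined(__x86_64__)
#if defined(PENTIUM4_CPU)
	/* MFENCE orders all earlier loads and stores before the MOV. */
	__asm__ __volatile__("mfence; movl %1, %0"
	    : "=m" (*p)
	    : "r" (v)
	    : "memory");
#else
	/* XCHG with a memory operand carries an implied LOCK. */
	__asm__ __volatile__("xchgl %1, %0"
	    : "+m" (*p), "+r" (v)
	    :
	    : "memory");
#endif
#else
	/* Assumption: non-x86 fallback so the sketch compiles anywhere. */
	__atomic_store_n(p, v, __ATOMIC_RELEASE);
#endif
}

/* Usage: publish a value, then read it back on the same thread. */
static inline uint32_t
sketch_demo(void)
{
	volatile uint32_t flag = 0;

	sketch_store_rel_32(&flag, 1);
	return (flag);
}
```

Building once as-is and once with -DPENTIUM4_CPU gives the two variants
to benchmark against each other on both UP and SMP.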

Scott