Atomic operations on i386/amd64
Scott Long
scottl at samsco.org
Thu Aug 5 15:17:42 PDT 2004
John Baldwin wrote:
> On Thursday 05 August 2004 01:04 am, Tim Robbins wrote:
>
>>Is there any particular reason why atomic_load_acq_*() and
>>atomic_store_rel_*() are implemented with CMPXCHG and XCHG instead of
>>MOV on i386/amd64 UP?
>
>
> Actually, using mov instead of lock xchg for store_rel reduced performance in
> some benchmarks Scott ran on an SMP machine, I'm guessing due to the higher
> latency of locks becoming available to other CPUs. I'm still waiting for
> benchmark results on UP to see if the change should be made under #ifndef SMP
> or some such.
>
>
>>Also, could we use MFENCE/LFENCE/SFENCE in combination with MOV on
>>SMP systems instead of LOCK CMPXCHG / (implied LOCK) XCHG?
>
>
> MFENCE and LFENCE only exist on the P4. SFENCE only exists on P3+, so to do
> so you'd lose the ability to run on PIIs and earlier. Also, if you use more
> than SFENCE you lose PIIIs. Note that amd64 could probably be changed,
> though, since they might all have fences, in which case that might be
> something to benchmark on both UP and SMP to see what kind of difference it
> makes.
>
We can always define PENTIUM2_CPU, PENTIUM3_CPU, and PENTIUM4_CPU CPU
types in the kernel and then #ifdef the code appropriately (shipping
with the lowest common denominator, like we do for
I386/I486/I586/I686).
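
Roughly, the #ifdef idea might look like the sketch below. This is only an
illustration, not the actual atomic.h code: PENTIUM4_CPU is the hypothetical
option named above, sketch_store_rel_int() is a made-up name, and putting
MFENCE ahead of the MOV is my assumption about how a fence-based release
store would be arranged. It assumes GCC-style inline asm; whether the fence
variant is actually faster is exactly what would need benchmarking.

```c
/*
 * Sketch only: choose the store-release implementation at compile time.
 * PENTIUM4_CPU is a hypothetical kernel option (build with -DPENTIUM4_CPU
 * to select it); sketch_store_rel_int() is not a real kernel function.
 */
static inline void
sketch_store_rel_int(volatile unsigned int *p, unsigned int v)
{
#if defined(__i386__) || defined(__x86_64__)
#if defined(PENTIUM4_CPU)
	/* P4 and later: fence first, then a plain MOV (assumed ordering). */
	__asm__ __volatile__("mfence\n\tmovl %1,%0"
	    : "=m" (*p)
	    : "r" (v)
	    : "memory");
#else
	/*
	 * Lowest common denominator: XCHG with a memory operand is
	 * implicitly locked, so it runs on every x86 CPU.
	 */
	__asm__ __volatile__("xchgl %1,%0"
	    : "+m" (*p), "+r" (v)
	    :
	    : "memory");
#endif
#else
	/* Non-x86 fallback so the sketch still compiles elsewhere. */
	__atomic_store_n(p, v, __ATOMIC_RELEASE);
#endif
}
```

A default build takes the XCHG path, so the shipped kernel would still run
on older CPUs; only a kernel built with the hypothetical PENTIUM4_CPU
option would use the fence.
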
Scott