svn commit: r285854 - head/sys/amd64/include

Sat Jul 25 18:03:55 UTC 2015

On 07/24/2015 16:15, John-Mark Gurney wrote:
> Alan Cox wrote this message on Fri, Jul 24, 2015 at 19:43 +0000:
>> Author: alc
>> Date: Fri Jul 24 19:43:18 2015
>> New Revision: 285854
>> URL: https://svnweb.freebsd.org/changeset/base/285854
>>
>> Log:
>>   Add a comment discussing the appropriate use of the atomic_*() functions
>>   with acquire and release semantics versus the *mb() functions on amd64
>>   processors.
> Please put this documentation in the atomic(9) man page where it is
> easier to read and access...  it's probably best to just move it
> there and reference atomic(9) here...
>
> Also, this advice isn't amd64 specific is it?  If it isn't, why is it
> in an amd64 include file?

While the first sentence is not amd64 specific, the core of this
paragraph, the third, fourth, and fifth sentences, is very amd64 specific.
In particular, the redundancy of the rmb() and wmb() functions for
ordinary cases of interprocessor memory ordering is not generally true
across architectures that we support.  For example, on arm64 or powerpc,
these functions do provide non-redundant ordering.
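
To make the recommendation concrete, here is a minimal sketch of the
acquire/release pattern for passing data between processors.  It is
written in portable C11 (<stdatomic.h>) so that it compiles in userland;
it is not the code from atomic.h.  The names payload, ready, publish()
and try_consume() are hypothetical and exist only for illustration.  In
kernel code the rough equivalents would be the atomic(9) operations
atomic_store_rel_int() and atomic_load_acq_int().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state, for illustration only. */
static int payload;		/* ordinary data, written before publication */
static atomic_int ready;	/* publication flag, zero until published */

/*
 * Producer: write the data, then publish it with a release store.
 * The release store keeps the earlier write to payload from being
 * reordered after the store to ready.
 */
void
publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, 1, memory_order_release);
}

/*
 * Consumer: an acquire load of the flag keeps the later read of
 * payload from being reordered before it.  Returns false if nothing
 * has been published yet.
 */
bool
try_consume(int *out)
{
	if (atomic_load_explicit(&ready, memory_order_acquire) == 0)
		return (false);
	*out = payload;
	return (true);
}
```

A consumer on another processor would call try_consume() in a loop until
it returns true.  On amd64 a compiler can normally implement both
operations with plain mov instructions, because the hardware already
preserves store/store and load/load order; on a weaker model such as
arm64 or powerpc the same source yields whatever barriers that
architecture needs.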

But, I do agree that the first sentence also belongs in a man page, like
atomic(9).  Today, however, we have no man page documenting the *mb()
functions.

>> Modified:
>>   head/sys/amd64/include/atomic.h
>>
>> Modified: head/sys/amd64/include/atomic.h
>> ==============================================================================
>> --- head/sys/amd64/include/atomic.h	Fri Jul 24 19:37:30 2015	(r285853)
>> +++ head/sys/amd64/include/atomic.h	Fri Jul 24 19:43:18 2015	(r285854)
>> @@ -32,6 +32,25 @@
>>  #error this file needs sys/cdefs.h as a prerequisite
>>  #endif
>>  
>> +/*
>> + * To express interprocessor (as opposed to processor and device) memory
>> + * ordering constraints, use the atomic_*() functions with acquire and release
>> + * semantics rather than the *mb() functions.  An architecture's memory
>> + * ordering (or memory consistency) model governs the order in which a
>> + * program's accesses to different locations may be performed by an
>> + * implementation of that architecture.  In general, for memory regions
>> + * defined as writeback cacheable, the memory ordering implemented by amd64
>> + * processors preserves the program ordering of a load followed by a load, a
>> + * load followed by a store, and a store followed by a store.  Only a store
>> + * followed by a load to a different memory location may be reordered.
>> + * Therefore, except for special cases, like non-temporal memory accesses or
>> + * memory regions defined as write combining, the memory ordering effects
>> + * provided by the sfence instruction in the wmb() function and the lfence
>> + * instruction in the rmb() function are redundant.  In contrast, the
>> + * atomic_*() functions with acquire and release semantics do not perform
>> + * redundant instructions for ordinary cases of interprocessor memory
>> + * ordering on any architecture.
>> + */
>>  #define	mb()	__asm __volatile("mfence;" : : : "memory")
>>  #define	wmb()	__asm __volatile("sfence;" : : : "memory")
>>  #define	rmb()	__asm __volatile("lfence;" : : : "memory")
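
The one reordering that the comment says amd64 does permit, a store
followed by a load from a different location, can be illustrated with a
Dekker-style handshake.  This is only a sketch in portable C11, not
kernel code: flag0, flag1, cpu0_try_enter() and cpu1_try_enter() are
hypothetical names, and atomic_thread_fence(memory_order_seq_cst) stands
in for a full barrier such as the mb() shown in the diff.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flags: each side sets its own, then checks the other's. */
static atomic_int flag0, flag1;

/*
 * Each function stores to its own flag and then loads the other flag.
 * Without the full fence, the store may still be pending when the load
 * is performed, so both sides could read 0.  With a sequentially
 * consistent fence on both sides, at least one side is guaranteed to
 * observe the other's store.
 */
bool
cpu0_try_enter(void)
{
	atomic_store_explicit(&flag0, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return (atomic_load_explicit(&flag1, memory_order_relaxed) == 0);
}

bool
cpu1_try_enter(void)
{
	atomic_store_explicit(&flag1, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return (atomic_load_explicit(&flag0, memory_order_relaxed) == 0);
}
```

When the two functions run concurrently on different processors, the
fences ensure they cannot both return true.  Acquire and release
semantics alone would not give that guarantee, which is why this case
falls outside the "ordinary cases" of interprocessor ordering.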