svn commit: r346593 - head/sys/sys

Fri Apr 26 10:22:36 UTC 2019

On Fri, Apr 26, 2019 at 08:04:29PM +1000, Bruce Evans wrote:
> On Fri, 26 Apr 2019, Mark Johnston wrote:
> 
> > On Fri, Apr 26, 2019 at 10:38:36AM +0300, Konstantin Belousov wrote:
> >> On Fri, Apr 26, 2019 at 02:04:56AM -0400, Mark Johnston wrote:
> >>> On Thu, Apr 25, 2019 at 11:22:22AM +0300, Konstantin Belousov wrote:
> >>>> On Thu, Apr 25, 2019 at 07:38:21AM +0200, Wojciech Macek wrote:
> >>>>> Intel does not reorder reads against the condition "if" here. I know for
> >>>>> sure that ARM does, but there still might be some other architectures that
> >>>>> also suffer from such behavior - I just don't have any means to verify.
> >>>>> I remember the discussion for rS302292 where we agreed that this kind of
> >>>>> patch should have as little performance impact as possible. Adding an
> >>>>> unconditional memory barrier causes a significant performance drop on Intel,
> >>>>> where in fact, the issue was never seen.
> >>>>>
> >>>> atomic_thread_fence_acq() is a nop on x86, or rather, it is a compiler
> >>>> memory barrier.  If you need a read/read fence on some architectures, I am
> >>>> sure that you need a compiler barrier on all.
> >>>
> >>> To add a bit, one reason to prefer atomic(9) to explicit fences is
> >>> precisely because it issues fences only when required by a given
> >>> CPU architecture.  There is no "unconditional memory barrier" added by
> >>> the diff even without the #ifdef.
> >> Well, atomic_thread_fence_acq() is the explicit fence.  And on x86 it
> >> does add an unconditional compiler memory barrier.
> >
> > I only mean that with atomic_thread_fence_acq() on x86, the CPU does not
> > see any fences.
> >
> > Based on the original commit it seems that a compiler barrier is
> > required on all platforms, at a minimum.
> 
> buf_ring.h has some volatile variables which might give sufficient barriers.
> But no one knows what volatile does, so reasoning about it is even harder
> than reasoning about ordering from atomic ops.  I think the volatiles give
> program order for the volatile variables only (plus ordering of other variables
> from dependencies on the volatile variables), while the compiler barrier
> gives program order for all variables.

No, volatile does not give any ordering. For gcc-like compilers, the
documentation implies that volatile accesses are guaranteed to
occur, i.e. they cannot be optimized out. We use volatiles to implement
relaxed atomics in the atomic(9) API.
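
As a rough sketch of the distinction (hypothetical names, assuming a
gcc-like compiler; this is not the actual atomic(9) or buf_ring.h code):
the volatile load below is guaranteed to be emitted, but only the explicit
compiler barrier stops the compiler from moving the plain payload read
above the check. The barrier emits no CPU instruction, so on architectures
that reorder loads a real acquire fence would still be needed.

```c
#include <stdint.h>

/* Hypothetical helper: a compiler-only barrier, no CPU fence is emitted. */
#define	compiler_barrier()	__asm__ __volatile__("" ::: "memory")

/* Hypothetical structure, for illustration only. */
struct slot {
	volatile uint32_t ready;	/* flag, accessed as a relaxed atomic */
	uint32_t payload;		/* plain data guarded by the flag */
};

/* Relaxed load: the access must occur, but nothing is ordered around it. */
static inline uint32_t
relaxed_load_32(const volatile uint32_t *p)
{
	return (*p);
}

/* Returns 1 and fills *out if the slot is ready, 0 otherwise. */
static inline int
consume(struct slot *s, uint32_t *out)
{
	if (relaxed_load_32(&s->ready) == 0)
		return (0);
	/* Keep the compiler from hoisting the payload read above the check. */
	compiler_barrier();
	*out = s->payload;
	return (1);
}
```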

For Java, volatile reads have acquire semantics, and volatile writes are
releases.
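
For comparison, a minimal C11 sketch of the same acquire/release pairing,
which in C has to be requested explicitly through <stdatomic.h> rather
than through volatile. The struct and function names are made up for
illustration.

```c
#include <stdatomic.h>

/* Hypothetical structure, for illustration only. */
struct box {
	int		payload;	/* plain data */
	atomic_int	ready;		/* publication flag */
};

/* Writer: the release store orders the payload write before the flag. */
static void
publish(struct box *b, int value)
{
	b->payload = value;
	atomic_store_explicit(&b->ready, 1, memory_order_release);
}

/* Reader: the acquire load orders the flag read before the payload read. */
static int
try_read(struct box *b, int *out)
{
	if (atomic_load_explicit(&b->ready, memory_order_acquire) == 0)
		return (0);
	*out = b->payload;
	return (1);
}
```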