svn commit: r213648 - head/sys/kern
John Baldwin
jhb at freebsd.org
Wed Oct 13 20:03:11 UTC 2010
On Wednesday, October 13, 2010 12:21:44 pm Gennady Proskurin wrote:
> Thank you and Bruce for explanation.
> The key point here (which I did not understand) is that
> "something == PCPU_GET(cpuid))" may be true only if "something" is set by the
> current cpu, in which case its value is not stale.
Yes. mtx_owned() (and mtx_assert()) depend on a similar "feature" in that
stale values don't hurt.
> On Wed, Oct 13, 2010 at 09:05:04AM -0400, John Baldwin wrote:
> > On Tuesday, October 12, 2010 5:08:04 pm Gennady Proskurin wrote:
> > > On Sat, Oct 09, 2010 at 12:48:50PM +0300, Andriy Gapon wrote:
> > > > on 09/10/2010 12:33 Bruce Evans said the following:
> > > > > On Sat, 9 Oct 2010, Andriy Gapon wrote:
> > > > >
> > > > >> Log:
> > > > >> panic_cpu variable should be volatile
> > > > >>
> > > > >> This is to prevent caching of its value in a register when it is checked
> > > > >> and modified by multiple CPUs in parallel.
> > > > >> Also, move the variable into the scope of the only function that uses it.
> > > > >>
> > > > >> Reviewed by: jhb
> > > > >> Hint from: mdf
> > > > >> MFC after: 1 week
> > > > >
> > > > > I doubt that this is either necessary or sufficient. Most variables
> > > > > aren't volatile while they are locked by a mutex. But panic() cannot use
> > > > > normal mutex locking, so it should access this variable using atomic
> > > > > ops with memory barriers. The compiler part of the memory barriers
> > > > > effectively make _all_ variables temporily volatile. However, 2 of the
> > > > > the 4 accesses to this variable doesn't use an atomic op.
> > > > >
> > > > > % #ifdef SMP
> > > > > % /*
> > > > > % * We don't want multiple CPU's to panic at the same time, so we
> > > > > % * use panic_cpu as a simple spinlock. We have to keep checking
> > > > > % * panic_cpu if we are spinning in case the panic on the first
> > > > > % * CPU is canceled.
> > > > > % */
> > > > > % if (panic_cpu != PCPU_GET(cpuid))
> > > > >
> > > > > This access doesn't use an atomic op.
> > > > >
> > > > > % while (atomic_cmpset_int(&panic_cpu, NOCPU,
> > > > > % PCPU_GET(cpuid)) == 0)
> > > > >
> > > > > This access uses an atomic op. Not all atomic ops use memory barriers,
> > > > > at least on i386. However, this one does, at least on i386.
> > > > >
> > > > > (I'm always confused about what atomic_any() without acq or release
> > > > > means. Do they mean that you don't want a memory barrier (this is
> > > > > what the mostly give, at least on i386), and if so, what use are
> > > > > they? There are far too many atomic ops, for far too many never-used
> > > > > widths, with alternative spellings to encourage using a wrong one.
> > > > > cmpset is is especially confusing since it you can spell it as cmpset,
> > > > > cmpset_acq or compset_rel and always get the barrier, at least on
> > > > > i386. At least cmpset only supports 1 width, at least on i386.)
> > > > >
> > > > > % while (panic_cpu != NOCPU)
> > > > >
> > > > > This access doesn't use an atomic op.
> > > > >
> > > > > % ; /* nothing */
> > > > > % #endif
> > > > > % ...
> > > > > % #ifdef RESTARTABLE_PANICS
> > > > > % /* See if the user aborted the panic, in which case we continue. */
> > > > > % if (panicstr == NULL) {
> > > > > % #ifdef SMP
> > > > > % atomic_store_rel_int(&panic_cpu, NOCPU);
> > > > >
> > > > > This access uses an atomic op with a memory barrier, at least on i386.
> > > > > Now its rel semantics are clear.
> > > > >
> > > > > panicstr is non-volatile and never accessed by an atomic op, so it probably
> > > > > strictly needs to be declared volatile even more than panic_cpu. I think
> > > >
> > > > I agree about panicstr.
> > > > But I am not sure if we have places in code (beyond panic function) itself where
> > > > volatile vs. non-volatile would make any actual difference.
> > > > But still.
> > > >
> > > > > the only thing that makes it work now is that it is bogusly pubic, and
> > > > > compilers can't analyze the whole program yet -- if they could, then they
> > > > > would see that it is only set in panic().
> > > > >
> > > > > % #endif
> > > > > % return;
> > > > > % }
> > > > > % #endif
> > > > > % #endif
> > > > >
> > > > > Now, why don't the partial memory barriers prevent caching the variable?
> > > > >
> > > > > % if (panic_cpu != PCPU_GET(cpuid))
> > > > > % while (atomic_cmpset_int(&panic_cpu, NOCPU,
> > > > > % PCPU_GET(cpuid)) == 0)
> > > > > % while (panic_cpu != NOCPU)
> > > > > % ; /* nothing */
> > > > >
> > > > > The very first access can't reasonably use a cachec value. atomic_cmpset()
> > > > > can change panic_cpu, so even without the memory barrier, panic_cpu must
> > > > > be reloaded for the third access the first time. But then when the third
> > > > > access is repeated in the second while loop, the missing atomic op with
> > > > > barrier makes things very broken. panic_cpu isn't changed by the loop,
> > > > > and the compiler thinks that it isn't changed by anything else either, so
> > > > > the compiler may reduce the loop to:
> > > > >
> > > > > % if (panic_cpu != NOCPU)
> > > > > % for (;;)
> > > > > % ; /* nothing */
> > > >
> > > >
> > > > Yes, it's exactly the last loop that had the problem.
> > > > On amd64 with -O2:
> > > > .loc 1 544 0
> > > > movl panic_cpu(%rip), %eax
> > > > .LVL134:
> > > > .p2align 4,,7
> > > > .L210:
> > > > cmpl $255, %eax
> > > > jne .L210
> > > > jmp .L225
> > > >
> > > > > except I've seen claims that even an endless for loop can be optimized
> > > > > to nothing. Declaring panic_cpu as volatile prevents the compiler doing
> > > > > this, but I think it is insufficient. We really do want to see panic_cpu
> > > > > changed by other CPUs, and what is the point of atomic_load_acq*() if not
> > > > > to use for this -- if declaring things volatile were sufficient, then we
> > > > > could just use *(volatile *)&var. atomic_load/store_acq/rel*() on i386
> > > >
> > > > I discussed this with kib and the idea is that atomic operation is not needed in
> > > > that place and volatile is sufficient, because we don't need barrier semantics
> > > > there and only want to ensure that variable is reloaded from memory.
> > >
> > > If I understand correctly, without acquiring barrier, variable is not
> > > guaranteed to be loaded from memory, it can end up with some stale value from
> > > cpu's cache or pipeline. If incorrect value will be checked in the first "if"
> > > operator, it can lead to skiping all the atomic synchronization.
> > > The same about writing to variable, without barrier new value may be written to
> > > some local cpu cache and not be seen by readers until it is flushed from cache.
> >
> > No, a barrier does _not_ force any cache flushes. The point of the volatile
> > is to serve as a compiler barrier. A memory barrier only enforces a ordering
> > in memory operations in the CPU, it does not flush any caches or force a CPU
> > to post writes to memory. It is weaker than that. :) For example, the
> > sparcv9 manuals explicitly state something along the lines that any write may
> > be held in the CPU's store buffer for an unspecified amount of time.
> > Barriers do not alter that, nor would it really be useful for them to do so.
> > What barriers allow you to do is order the operations on a lock cookie (such
> > as mtx_lock in mutexes) with respect to operations on other memory locations
> > (e.g. the data a lock protects). By using a specific set of barriers and
> > protocols for accessing the lock cookie, you can ensure that the protected
> > data is not accessed without holding the lock. However, for a standalone
> > word, memory barriers do not buy you anything. And in fact, if one CPU has
> > written to a variable and that write is still sitting in the store buffer,
> > 'atomic_load_acq' on another CPU may still return a stale value. All that
> > FreeBSD requires is that 'atomic_cmpset' will not report success using stale
> > data. It can either fail or block waiting for the write in a store buffer in
> > another CPU to drain. We typically handle the 'fail' case by continuing to
> > loop until we observe a new state or succesfully perform 'atomic_cmpset'
> > (for an example, see setting of MTX_CONTESTED in mtx_lock).
> >
> > Currently we do not have 'atomic_load()' and 'atomic_store()' variants without
> > memory barriers since 'x = y' is atomic (in that it is performed as a single
> > operation). However, there are cases where 'x = y' needs a compiler memory
> > barrier (but not a CPU one). (e.g. if 'y' can be updated by a signal handler
> > in userland, or an interrupt in the kernel.) That is what 'volatile' is for.
> >
> > --
> > John Baldwin
> > _______________________________________________
> > svn-src-head at freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/svn-src-head
> > To unsubscribe, send any mail to "svn-src-head-unsubscribe at freebsd.org"
> >
>
--
John Baldwin
More information about the svn-src-all
mailing list