Questions about mutex implementation in kern/kern_mutex.c

Fri Sep 17 17:42:45 UTC 2010

On Thu, Sep 16, 2010 at 02:16:05PM -0400, John Baldwin wrote:
> On Thursday, September 16, 2010 1:33:07 pm Andrey Simonenko wrote:
>
> > "Current" value means that the value of a variable read by one thread
> > is equal to the value of this variable successfully updated by another
> > thread by the compare-and-set instruction.  As I understand from the kernel
> > source code, atomic_cmpset_ptr() allows to update a variable in a way that
> > all other CPUs will invalidate corresponding cache lines that contain
> > the value of this variable.
> 
> That is not true.  It is likely true on x86, but it is certainly not true on
> other architectures such as sparc64 where a write may be held in a store 
> buffer for an indeterminate amount of time (and note that some lock releases 
> are simple stores with a "rel" memory barrier).  All that we require is that 
> if the value is stale, the atomic_cmpset() that attempts to set MTX_CONTESTED 
> will fail.

I missed the _release_lock_quick() call in _mtx_unlock_sleep().

> 
> > The mtx_owned(9) macro uses this property, mtx_owned() does not use anything
> > special to compare the value of m->mtx_lock (volatile) with current thread
> > pointer, all other functions that update m->mtx_lock of unowned mutex use
> > compare-and-set instruction.  Also I cannot find anything special in
> > generated Assembler code for volatile variables (except for ia64 where
> > acquire loads and release stores are used).
> 
> No, mtx_owned() is just not harmed by the races it loses.  You can certainly 
> read a stale value of mtx_lock in mtx_owned() if some other thread owns the 
> lock or has just released the lock.  However, we don't care, because in both 
> of those cases, mtx_owned() returns false.  What does matter is that 
> mtx_owned() can only return true if we currently hold the mutex.  This works 
> because 1) the same thread cannot call mtx_unlock() and mtx_owned() at the 
> same time, and 2) even CPUs that hold writes in store buffers will snoop their 
> store buffer for local reads on that CPU.  That is, a given CPU will never 
> read a stale value of a memory word that is "older" than a write it has 
> performed to that word.

I think I now understand why mtx_owned() works correctly whether or not
mtx_lock is present in the CPU cache.  The mtx_lock value can reliably say
whether the lock is held by the current thread, but it cannot reliably say
whether the lock is unowned or is owned by another thread.
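
To make sure I have this right, here is a minimal user-space sketch of
that property.  It assumes C11 <stdatomic.h> instead of the kernel's
atomic(9) primitives, ignores flag bits such as MTX_CONTESTED, and uses
hypothetical names (toy_mtx, toy_trylock, toy_unlock, toy_owned) that do
not exist in the kernel:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_UNOWNED ((uintptr_t)0)

/* Toy lock word: the owner's thread id, or TOY_UNOWNED. */
struct toy_mtx {
    _Atomic uintptr_t lock;
};

/* Acquire: succeeds only if the lock word is currently TOY_UNOWNED. */
bool toy_trylock(struct toy_mtx *m, uintptr_t tid)
{
    uintptr_t expected = TOY_UNOWNED;

    assert(tid != TOY_UNOWNED);
    return atomic_compare_exchange_strong_explicit(&m->lock, &expected, tid,
        memory_order_acquire, memory_order_relaxed);
}

/* Release: a plain store with release ordering, no compare-and-set. */
void toy_unlock(struct toy_mtx *m)
{
    atomic_store_explicit(&m->lock, TOY_UNOWNED, memory_order_release);
}

/*
 * Ownership test: a relaxed load is enough.  Only the calling thread
 * ever stores its own tid into the lock word, and a thread always
 * observes its own latest store, so a stale value can only belong to
 * some other thread (or be TOY_UNOWNED) and compares unequal.
 */
bool toy_owned(struct toy_mtx *m, uintptr_t tid)
{
    return atomic_load_explicit(&m->lock, memory_order_relaxed) == tid;
}
```

In this model toy_owned(&m, tid) can only return true between a
successful toy_trylock(&m, tid) and the matching toy_unlock(&m) made by
the same thread; a false result says nothing about which of the other
two states the lock is in.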

Let me ask one more question about memory barriers and thread migration.

Suppose a thread locked a mutex, modified shared data protected by this
mutex and was then migrated from CPU1 to CPU2 (the mutex is still locked).
In this scenario the just-migrated thread will not see stale data on CPU2,
either for the mutex itself (the m->mtx_lock value) or for the shared data,
because when it was migrated from CPU1 there was at least one unlock call
for some other mutex that had release semantics, and the appropriate memory
barrier instruction was run implicitly or explicitly.  As a result this
"rel" memory barrier made all modifications from CPU1 visible to other
CPUs.  When CPU2 switched to the just-migrated thread there was at least
one lock call for some other mutex with acquire semantics, so the "rel/acq"
memory barrier pair works together here.  (I also consider the case where
CPU2 did not work with that mutex, but worked with its memory before: some
thread on CPU2 could have allocated some memory, worked with it and freed
it, and later the same memory was allocated by a thread on CPU1 for the
mutex.)

Is the above description correct?

This logic of memory barriers is described in detail in the SPARC V9
documentation, in the description of the MEMBAR instruction.  In fact MEMBAR
with appropriate masks is used in atomic.h for this architecture.  As I
understand it, the same logic for memory barriers (atomic_..._rel and
atomic_..._acq) applies to all other architectures.  Otherwise I do not
understand how the mtx_lock() and mtx_unlock() pair can protect data and
ensure that a thread that locked a mutex will see correct (not stale) data
protected by this mutex.
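
The rel/acq pairing I have in mind can be sketched in user space as
follows.  This is only a model under my own assumptions: it uses C11
<stdatomic.h> rather than atomic_..._rel and atomic_..._acq, the flag
switched_out merely stands in for whatever lock word the scheduler
releases and acquires around a context switch, and the names handoff,
cpu1_switch_out and cpu2_switch_in are hypothetical:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct handoff {
    int shared_data;          /* plain data, written without atomics */
    atomic_bool switched_out; /* stand-in for a scheduler lock word */
};

/* Runs on "CPU1": modify the data, then do a release store. */
void cpu1_switch_out(struct handoff *h, int value)
{
    h->shared_data = value;
    /*
     * Release: a thread that observes this store through an acquire
     * load is guaranteed to observe the store to shared_data as well.
     */
    atomic_store_explicit(&h->switched_out, true, memory_order_release);
}

/*
 * Runs on "CPU2": returns true and fills *out only once the release
 * store above has been observed by the acquire load.
 */
bool cpu2_switch_in(struct handoff *h, int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&h->switched_out, memory_order_acquire))
        return false;
    *out = h->shared_data;    /* cannot be stale at this point */
    return true;
}
```

The point of the sketch is that shared_data itself needs no barrier of
its own: it is ordered only by the release store that follows it on one
side and the acquire load that precedes the read on the other side.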

> > There are some places in the kernel where a variable is updated in
> > something like "do { v = value; } while (!atomic_cmpset_int(&value, ...));"
> > and that variable is not "volatile", but the compiler generates correct
> > Assembler code.  So "volatile" is not a requirement for all cases.
> 
> Hmm, I suspect that many of those places actually do use volatile.  The 
> various lock cookies (mtx_lock, etc.) are declared volatile in the structure.  
> Otherwise the compiler would be free to conclude that 'v = value;' is a loop 
> invariant and move it out of the loop which would break.  Given that, the 
> construct you referred to does in fact require 'value' to be volatile.

I checked the assembler code for these functions:

kern/subr_msgbuf.c:msgbuf_addchar()
vm/vm_map.c:vmspace_free()
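
For reference, the loop shape under discussion looks roughly like the
sketch below.  It is not the code from either function: it assumes C11
<stdatomic.h> in user space, and toy_add_wrap is a hypothetical helper.
Here the explicit atomic load does the job that "volatile" does in the
quoted explanation, namely forcing a fresh read on every iteration:

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Atomically replace *value with (*value + n) % size and return the
 * value that was replaced.
 */
unsigned int toy_add_wrap(_Atomic unsigned int *value, unsigned int n,
    unsigned int size)
{
    unsigned int old, next;

    assert(size != 0);
    do {
        /* Re-read on every pass; this load must not be hoisted. */
        old = atomic_load_explicit(value, memory_order_relaxed);
        next = (old + n) % size;
        /* On failure the CAS also writes the current value into old. */
    } while (!atomic_compare_exchange_weak_explicit(value, &old, next,
        memory_order_acq_rel, memory_order_relaxed));
    return old;
}
```

If the read of the variable were moved out of the loop, a failed
compare-and-set would retry forever with the same stale expected value,
which is the breakage described above.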

Thank you for your answers.