lockless file descriptor lookup

Bruce Evans brde at optusnet.com.au
Thu May 21 09:37:48 UTC 2009

On Wed, 20 May 2009, John Baldwin wrote:

> On Wednesday 20 May 2009 2:59:52 pm Jeff Roberson wrote:
>> On Thu, 14 May 2009, Bruce Evans wrote:
>>> Anyway, you probably need atomics that have suitable memory barriers.
>>> Memory barriers must affect the compiler and make it perform refreshes
>>> for them to work, so you shouldn't need any volatile casts.  E.g., all
>>> atomic store operations (including cmpset) have release semantics even
>>> if they aren't spelled with "_rel" or implemented using inline asm.
>>> On amd64 and i386, they happen to be implemented using inline asm with
>>> "memory" clobbers.  The "memory" clobbers force refreshes of all
>>> non-local variables.

Actually, it is the "acquire" operations that happen to be implemented
with "memory" clobbers on amd64 and i386.  "release" semantics are
(completely?) automatic on amd64 and i386 so no "memory" clobbers are
used for them (except IIRC in old versions).

>> So I think I need an _acq memory barrier on the atomic cmpset of the
>> refcount to prevent speculative loading of the fd_ofiles array pointer by
>> the processor and the volatile in the second dereference as I have it
>> now to prevent caching of the pointer by the compiler.  What do you think?

I thought that it was a _rel barrier that was needed due to my misreading
of the "memory" clobbers corrected above.  Perhaps both _acq and _rel
are needed in cases like yours where a single cmpset corresponds to a
(lock, unlock) pair.  On amd64 and i386, plain atomic_cmpset already
has both (_acq via the explicit "memory" clobber, and _rel implicitly),
but the man page doesn't say that this is generic.  It only says that
all stores have _rel semantics, and it uses an explicit _aqu suffixes
in examples of how to use cmpset to implement locking (the examples
are rotted copies of locking in sys/mutex.h).  Since a
successful plain cmpset does a store, this implicitly says that plain
cmpset's have _rel semantics and cmpset_acq has both _acq and _rel

Mutex locking has always been careful to use an explicit _acq suffix,
but most code in /sys isn't.  In a /sys tree deated ~March 30, there
are 280 lines matching atomic_cmpset but only 72 lines matching
atomic_cmpset_acq and 47 lines matching atomic_cmpset_rel.  Excluding
the implementation (atomic.h), there are 153 lines matching atomic_cmpset,
35 matching atomic_cmpset_acq and 12 matching atomic_cmpset_rel; this
gives 106 lines that are probably missing an _acq or a _rel suffix.
No one replied to my previous mails about this.  I would require
explicit suffix by not supporting plain cmpset, or not support the
_rel suffix for stores since because stores are always _rel, it is hard
to tell if an atomic store without the suffix really wants non-_rel or
is sloppy.  Despite the proliferation of interfaces, there is no
_acq_rel suffix to indicate that cmpset_acq is also _rel.

>> The references prior to the atomic increment have no real ordering
>> requirements.  Only the ones afterwards need to be strict so that we can
>> verify the results.

Most references are in a loop, so "before" and "after" are sort of the saeme:

%	for (;;) {
%		fp = fdp->fd_ofiles[fd];
%		if (fp == NULL)
%			break;
%		count = fp->f_count;
%		if (count == 0)
%			continue;
%		if (atomic_cmpset_int(&fp->f_count, count, count + 1) != 1)
%			continue;

I think we do depend on both _acq and _rel semantics here -- the missing
_acq to volatilize everything, and the implicit _rel just (?) to force
the memory copy of f_count to actually be incremented, as is required
for an atomic store to actually work.

%		if (fp == ((struct file *volatile*)fdp->fd_ofiles)[fd])
%			break;

The RHS here could be used again at the top of the loop.  The load for
the RHS is ordered after the cmpset, and so is the one at the top of
the loop, except for the first iteration.  I think this is unimportant.

%		fdrop(fp, curthread);
%	}

> I think having the _acq is correct and that the "memory" clobber it contains
> will force the compiler to reload fd_ofiles without needing the volatile cast
> (and thus that you can remove the volatile cast altogether and just add the
> _acq barrier).

I agree.

Please look at whether some of the ~106 other plain cmpset's need and
_acq prefix or should have a _rel prefix for clarity.  You should be
able to do this much faster than me, having written some of them :-).
E.g., the one in sio.c is for implementing a lock so it shuld use _acq
(though it might work without _acq since the lock is only used once),
but the ones in sx.h and kern_sx.c might be correct since they are
mostly for "trylock"-type operations.


More information about the freebsd-arch mailing list