lockless file descriptor lookup

Jeff Roberson jroberson at jroberson.net
Wed May 20 18:55:55 UTC 2009

On Thu, 14 May 2009, Bruce Evans wrote:

> On Tue, 12 May 2009, Jeff Roberson wrote:
>> On Tue, 12 May 2009, Dag-Erling Sm?rgrav wrote:
>>> Jeff Roberson <jroberson at jroberson.net> writes:
>>>> I'd also appreciate it if someone could look at my volatile cast and
>>>> make sure I'm actually forcing the compiler to refresh the fd_ofiles
>>>> array here:
>>>> +		if (fp == ((struct file *volatile*)fdp->fd_ofiles)[fd])
> This has 2 style bugs (missing space after first '*' and missing space
> before second '*'.
> It isn't clear whether you want to refresh the fd_ofiles pointer to the
> (first element of) the array, or the fd'th element.  It is clear that
> you don't want to refresh the whole array.  The above refreshes the
> fd'th element.  Strangely, in my tests gcc refreshes the fd'th element
> even without the cast.  E.g.,

This is actually intended to catch cases where the descriptor array has 
expanded and the pointer to fd_ofiles has changed, or the file has been 
closed and the pointer at the fd'th element has changed.  I'm attempting 
to force the compiler to reload the fd_ofiles array pointer from the fdp 
structure.  If it has done that, it can not have the fd'th element cached 
and so that must be a fresh memory reference.

> 	test(fdp->fd_ofiles[fd], fdp->fd_ofiles[fd]);
> results in 1 memory access for each of the [fd]'s.
>>> The problem is that since it is not declared as volatile, some other
>>> piece of code may have modified it but not yet flushed it to RAM.
>> That is an acceptable race due to other guarantees.  If it hasn't been 
>> committed to memory yet, the old table still contains valid data.  I only 
>> need to be certain that the compiler doesn't cache the original ofiles 
>> value. It can't anyway because atomics use inline assembly on all platforms 
>> but I'd like it to be explicit anyway.
> It shouldn't matter that atomics use inline asm.  Non-broken inline
> asm declares all its inputs and outputs, so compilers can see what it
> changes just as easily as for C code (and more easily than for non-
> inline asm or C).

This is a good point.  It's all the more important that we get the 
volatile/memory barrier worked out correctly then.  I don't believe there 
are bugs today but it may be due to side-effects we shouldn't count on.

> Anyway, you probably need atomics that have suitable memory barriers.
> Memory barriers must affect the compiler and make it perform refreshes
> for them to work, so you shouldn't need any volatile casts.  E.g., all
> atomic store operations (including cmpset) have release semantics even
> if they aren't spelled with "_rel" or implemented using inline asm.
> On amd64 and i386, they happen to be implemented using inline asm with
> "memory" clobbers.  The "memory" clobbers force refreshes of all
> non-local variables.

So I think I need an _acq memory barrier on the atomic cmpset of the 
refcount to prevent speculative loading of the fd_ofiles array pointer by 
the processor and the volatile in the second dereference as I have it 
now to prevent caching of the pointer by the compiler.  What do you think?

The references prior to the atomic increment have no real ordering 
requirements.  Only the ones afterwards need to be strict so that we can 
verify the results.


> Bruce

More information about the freebsd-arch mailing list