lockless file descriptor lookup

Jeff Roberson jroberson at jroberson.net
Tue May 12 03:50:26 UTC 2009


This patch implements a lockless lookup path for file descriptors.  The 
meat of the algorithm is in fget_unlocked().  This returns a referenced 
file descriptor, unlike fget_locked().  In the common case this reduces 
the number of atomics required for fget() while allowing for lookups to 
proceed concurrently with modifications to the table and preventing 
preemption from causing context switches.

Using the libMicro 4.0 benchmarking suite with a thread count of 16 on an 
8core box yields improvements by as much as 428% in descriptor heavy 
tests.  There were no performance regressions with this benchmark.

The code works by allowing lookup threads to follow two previously unsafe 
pointers.  First, the file descriptor table itself is never freed on 
expansion until the process exits.  That ensures that no pagefaults or 
random memory access can occur if expansion happens after the table 
pointer is fetched.  Given that the vast majority of processes never 
expand their descriptor table, it is not any significant memory overhead 
to save them.  I shamelessly stole this idea from NetBSD.

The struct files themselves are marked as UMA_ZONE_NOFREE and never 
reclaimed.  This allows us to safely attempt to reference count them 
without any locks.  To prevent fdrop() races fget_unlocked() uses a cmpset 
loop to ensure that it never raises the reference count above zero.  In 
this way it can never reference a free'd or recently allocated file.

Once the file descriptor is resolved, we verify the path via the 
descriptor table once more to ensure that it has not changed.  At this 
point, we have a valid reference or we drop an invalid reference and 

This gives us the overhead of only one atomic instruction for common case 
file access.  In the worst case there can be some spinning in the loop in 
fget_unlocked(), but some thread always makes forward progress for each 
iteration of the loop.

I'm going to see if the usual suspects will stress test this but I'd like 
to see it in 8.0.  This is your chance to make any counter arguments.

I'd also appreciate it if someone could look at my volatile cast and make 
sure I'm actually forcing the compiler to refresh the fd_ofiles array 

+		if (fp == ((struct file *volatile*)fdp->fd_ofiles)[fd])


More information about the freebsd-arch mailing list