lockless file descriptor lookup
jroberson at jroberson.net
Tue May 12 03:50:26 UTC 2009
This patch implements a lockless lookup path for file descriptors. The
meat of the algorithm is in fget_unlocked(). This returns a referenced
file descriptor, unlike fget_locked(). In the common case this reduces
the number of atomics required for fget() while allowing for lookups to
proceed concurrently with modifications to the table and preventing
preemption from causing context switches.
Using the libMicro 4.0 benchmarking suite with a thread count of 16 on an
8core box yields improvements by as much as 428% in descriptor heavy
tests. There were no performance regressions with this benchmark.
The code works by allowing lookup threads to follow two previously unsafe
pointers. First, the file descriptor table itself is never freed on
expansion until the process exits. That ensures that no pagefaults or
random memory access can occur if expansion happens after the table
pointer is fetched. Given that the vast majority of processes never
expand their descriptor table, it is not any significant memory overhead
to save them. I shamelessly stole this idea from NetBSD.
The struct files themselves are marked as UMA_ZONE_NOFREE and never
reclaimed. This allows us to safely attempt to reference count them
without any locks. To prevent fdrop() races fget_unlocked() uses a cmpset
loop to ensure that it never raises the reference count above zero. In
this way it can never reference a free'd or recently allocated file.
Once the file descriptor is resolved, we verify the path via the
descriptor table once more to ensure that it has not changed. At this
point, we have a valid reference or we drop an invalid reference and
This gives us the overhead of only one atomic instruction for common case
file access. In the worst case there can be some spinning in the loop in
fget_unlocked(), but some thread always makes forward progress for each
iteration of the loop.
I'm going to see if the usual suspects will stress test this but I'd like
to see it in 8.0. This is your chance to make any counter arguments.
I'd also appreciate it if someone could look at my volatile cast and make
sure I'm actually forcing the compiler to refresh the fd_ofiles array
+ if (fp == ((struct file *volatile*)fdp->fd_ofiles)[fd])
More information about the freebsd-arch