Bug in kern_umtx.c -- read-write locks

Garrett Cooper yanefbsd at gmail.com
Wed Feb 3 05:16:30 UTC 2010


On Tue, Feb 2, 2010 at 8:05 PM, David Xu <davidxu at freebsd.org> wrote:
> Justin Teller wrote:
>>
>> I was working on a highly threaded app (125+ threads) that was using
>> the pthread rw locks, and we were stalling at strange times.  After a
>> lot of debugging in our app, we found that a call to
>> pthread_rwlock_wrlock() would sometimes never return -- it seemed like
>> a wakeup was lost.  After we convinced ourselves the bug wasn't in the
>> app's locking code, I started digging into the kernel.  I found that
>> there is an issue where a wakeup can be "lost" when a thread goes to
>> sleep calling pthread_rwlock_wrlock.  The issue is in the file
>> kern_umtx.c in the function do_rw_wrlock(): the code busies the lock
>> before sleeping, but when it tries to set the waiters bit, it's
>> looking at an old value (from the "try-lock" just before the busy).
>> This allows a race where a thread can go to sleep w/o setting the
>> waiters bit.  Then the last thread to unlock won't wake up the sleeping
>> thread.  The patch below (based off of 8.0 release) fixes my problem
>> for the write lock and should fix the complementary issue in
>> do_rw_rdlock().
>>
>>  <snip>
>
> Committed, thanks!

    This might be the reason why the pthreaded application I was
working on was crashing when I had it spawn more than 100 threads. I
tried 2k and 20k simple, short-lived threads that used a basic mutex,
and it got into some deadlock state and bombed. I'll see whether or
not this fixes my issue as well (though FWIW the same pthreaded app
was also falling over all over the place on Linux).
Thanks!
-Garrett
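
P.S. Since the patch is snipped above, here is a rough user-space model
of the race Justin describes, for anyone following along. It is only a
sketch under my own assumptions: the bit values and the names
WRITE_OWNER, WRITE_WAITERS, mark_waiter(), sleep_prep_stale() and
sleep_prep_fresh() are made up for illustration and are not the
kern_umtx.c definitions or the committed change. It only shows why
deciding from the try-lock snapshot can leave the waiters bit unset,
while re-reading the lock word just before setting the bit does not.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for this sketch, not the kernel's definitions. */
#define WRITE_OWNER   0x1u
#define WRITE_WAITERS 0x2u

/*
 * Starting from the caller's view of the lock word, try to set the
 * waiters bit while the lock is held.  Returns true only if the real
 * lock word is held AND has the waiters bit set on return, i.e. the
 * unlocking thread would know it has to wake somebody.
 */
static bool
mark_waiter(_Atomic uint32_t *word, uint32_t snapshot)
{
	uint32_t state = snapshot;
	uint32_t now;

	while ((state & WRITE_OWNER) != 0 && (state & WRITE_WAITERS) == 0) {
		/* On failure, 'state' is refreshed with the current value. */
		if (atomic_compare_exchange_strong(word, &state,
		    state | WRITE_WAITERS))
			break;
	}
	now = atomic_load(word);
	return ((now & WRITE_OWNER) != 0 && (now & WRITE_WAITERS) != 0);
}

/* Problem shape: reuse the value seen by the earlier try-lock. */
static bool
sleep_prep_stale(_Atomic uint32_t *word, uint32_t trylock_snapshot)
{
	return (mark_waiter(word, trylock_snapshot));
}

/* Safer shape: re-read the lock word right before setting the bit. */
static bool
sleep_prep_fresh(_Atomic uint32_t *word)
{
	return (mark_waiter(word, atomic_load(word)));
}
```

If the snapshot says "owned, waiters already set" but the lock has since
been released and re-taken (so the real word is just "owned"), the stale
variant skips the loop and reports that it is not safe to sleep, which
is the lost-wakeup window. The fresh variant sets the bit first.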


More information about the freebsd-current mailing list