Bug in kern_umtx.c -- read-write locks

Justin Teller justin.teller at gmail.com
Tue Feb 2 20:07:41 UTC 2010


I was working on a highly threaded app (125+ threads) that was using
the pthread rw locks, and we were stalling at strange times.  After a
lot of debugging in our app, we found that a call to
pthread_rwlock_wrlock() would sometimes never return -- it seemed like
a wakeup was lost.  After we convinced ourselves the bug wasn't in the
app's locking code, I started digging into the kernel.  I found that
there is an issue where a wakeup can be "lost" when a thread goes to
sleep calling pthread_rwlock_wrlock.  The issue is in the file
kern_umtx.c in the function do_rw_wrlock(): the code busies the lock
before sleeping, but when it tries to set the waiters bit, it's
looking at at old value (from the "try-lock" just before the busy).
This allows a race where a thread can go to sleep w/o setting the
waiters bit.  Then the last thread to unlock won't wakeup the sleeping
thread.  The patch below (based off of 8.0 release) fixes my problem
for the write lock and should fix the complimentary issue in
do_rw_rdlock.

--- kern_umtx.c	2010-02-01 13:03:24.000000000 -0800
+++ /kernel_src_8.0/usr/src/sys/kern/kern_umtx.c	2010-02-01
09:52:18.000000000 -0800
@@ -2461,10 +2461,15 @@
 		/* grab monitor lock */
 		umtxq_lock(&uq->uq_key);
 		umtxq_busy(&uq->uq_key);
 		umtxq_unlock(&uq->uq_key);

+		/* re-read the state, in case it changed between the try-lock above
+		 * and the check below
+		 */
+		state = fuword32(__DEVOLATILE(int32_t *, &rwlock->rw_state));
+
 		/* set read contention bit */
 		while ((state & wrflags) && !(state & URWLOCK_READ_WAITERS)) {
 			oldstate = casuword32(&rwlock->rw_state, state, state |
URWLOCK_READ_WAITERS);
 			if (oldstate == state)
 				goto sleep;
@@ -2592,10 +2597,15 @@

 		/* grab monitor lock */
 		umtxq_lock(&uq->uq_key);
 		umtxq_busy(&uq->uq_key);
 		umtxq_unlock(&uq->uq_key);
+
+		/* re-read the state, in case it changed between the try-lock above
+		 * and the check below
+		 */
+		state = fuword32(__DEVOLATILE(int32_t *, &rwlock->rw_state));

 		while (((state & URWLOCK_WRITE_OWNER) || URWLOCK_READER_COUNT(state) != 0) &&
 		       (state & URWLOCK_WRITE_WAITERS) == 0) {
 			oldstate = casuword32(&rwlock->rw_state, state, state |
URWLOCK_WRITE_WAITERS);
 			if (oldstate == state)


Looking thru the open PRs, it looks like it might be the cause of
#135673, but I'm not completely sure.  Also, I'm pretty new to
contributing to the FreeBSD community -- what's the common practice of
responding to a PR?

Thanks
-Justin


More information about the freebsd-current mailing list