Fine-grained locking for POSIX local sockets (UNIX domain sockets)

David Xu davidxu at
Mon May 8 10:43:37 UTC 2006

On Monday 08 May 2006 14:52, Kris Kennaway wrote:
> OK, David's patch fixes the umtx thundering herd (and seems to give a
> 4-6% boost).  I also fixed a thundering herd in FILEDESC_UNLOCK (which
> was also waking up 2-7 CPUs at once about 30% of the time) by doing
> s/wakeup/wakeup_one/.  This did not seem to give a performance impact
> on this test though.
> filedesc contention is down by a factor of 3-4, with corresponding
> reduction in the average hold time.  The process lock contention
> coming from the signal delivery wakeup has also gone way down for some
> reason.

I found that mysqld frequently calls alarm() (in its file thr_alarm.c) and
thr_kill() to send SIGALRM to its timer thread and wake it up. The timer
thread itself blocks in sigwait(). Normally the alarm timer expires in a
second, so the kernel periodically calls psignal() to find a thread that
can handle the signal. That means the kernel has to periodically walk the
thread list with the process lock and the scheduler lock held, which is
very expensive.

Most of the time thr_kill wakes the timer thread up earlier. In the thr_kill
syscall, the kernel has to walk the thread list to find the thread whose
ID matches the given ID. The function thread_find() uses a linear search,
which is slow: if there are lots of threads in the process, the process lock
is held too long. I think that's the reason you have seen so much process
lock contention. If you define USE_ALARM_THREAD in the mysql header file,
the contention should decrease (I hope). Patch:

--- my_pthread.h.old	Mon May  8 18:16:56 2006
+++ my_pthread.h	Mon May  8 18:17:07 2006
@@ -267,6 +267,8 @@
 /* Test first for RTS or FSU threads */
 #define HAVE_rts_threads
 extern int my_pthread_create_detached;
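
To illustrate why a linear thread lookup gets expensive, here is a rough
user-space sketch of an ID search over a linked list while a lock is held.
It is not the kernel's thread_find(): struct proc_sketch, struct
thread_sketch and thread_find_sketch are hypothetical names, and a pthread
mutex stands in for the process lock.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct thread_sketch {
	int tid;
	struct thread_sketch *next;
};

struct proc_sketch {
	pthread_mutex_t lock;		/* stands in for the process lock */
	struct thread_sketch *threads;	/* singly linked thread list */
};

/*
 * Linear search by thread ID. The lock is held for the whole walk,
 * so the hold time grows with the number of threads in the process.
 */
static struct thread_sketch *
thread_find_sketch(struct proc_sketch *p, int tid)
{
	struct thread_sketch *td;

	pthread_mutex_lock(&p->lock);
	for (td = p->threads; td != NULL; td = td->next) {
		if (td->tid == tid)
			break;
	}
	pthread_mutex_unlock(&p->lock);
	return (td);
}
```

With N threads, a lookup visits up to N entries before the lock is
released, so every other thread that needs the lock waits that long.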

> unp contention has risen a bit.  The other big gain is to sleep
> mtxpool contention, which roughly doubled:
> /*
>  * Change the total socket buffer size a user has used.
>  */
> int
> chgsbsize(uip, hiwat, to, max)
>         struct  uidinfo *uip;
>         u_int  *hiwat;
>         u_int   to;
>         rlim_t  max;
> {
>         rlim_t new;
>         UIDINFO_LOCK(uip);
> So the next question is how can that be optimized?
You may be able to use atomic_cmpset_int in a loop to avoid the context
switch, or use an adaptive mutex, but there is no adaptive mutex type you
can specify. rlim_t is a 64-bit integer, so that atomic operation cannot
be used on it, but a 64-bit integer might not be necessary for a socket
buffer size.
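
A minimal sketch of the compare-and-set loop idea, written as portable C11
rather than kernel code. Several things here are assumptions: the C11
atomic_compare_exchange_weak() stands in for atomic_cmpset_int, the per-user
total is taken to fit in an unsigned int, struct uid_total and chgsbsize_cas
are hypothetical names, and the update rule (adjust the total by to - *hiwat,
refusing growth past max) is a guess at what the locked section computes,
since the quoted function is cut off.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-user total; 32-bit on purpose. */
struct uid_total {
	atomic_uint sbsize;
};

/*
 * Lock-free sketch: move the caller's reservation from *hiwat to `to`.
 * Returns true on success, false if growing would exceed `max`.
 * Assumes the total never overflows an unsigned int.
 */
static bool
chgsbsize_cas(struct uid_total *ut, unsigned int *hiwat, unsigned int to,
    unsigned int max)
{
	unsigned int old = atomic_load(&ut->sbsize);
	unsigned int new;

	do {
		new = old + to - *hiwat;
		/* Refuse growth past the limit, but always allow shrinking. */
		if (to > *hiwat && new > max)
			return (false);
		/* On failure, `old` is refreshed with the current total. */
	} while (!atomic_compare_exchange_weak(&ut->sbsize, &old, new));

	*hiwat = to;
	return (true);
}
```

The loop retries only when another CPU changed the total in between, so
the caller never sleeps on a contested mutex. The price is that the limit
check is redone on each retry.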

> Kris

David Xu

More information about the freebsd-performance mailing list