Fine-grained locking for POSIX local sockets (UNIX domain sockets)
Kris Kennaway
kris at obsecurity.org
Mon May 8 06:52:11 UTC 2006
OK, David's patch fixes the umtx thundering herd (and seems to give a
4-6% boost). I also fixed a thundering herd in FILEDESC_UNLOCK (which
was also waking up 2-7 CPUs at once about 30% of the time) by doing
s/wakeup/wakeup_one/. This did not seem to have a measurable
performance impact on this test, though.
It seems to me that a more useful way to sort the mutex profiling list
is by the ratio of contended acquisitions to total acquisitions. Here
is the list re-sorted by cnt_hold/count, keeping only the top 40
values of count and the mutexes with nonzero contention:
Before:
max total count avg cnt_hold cnt_lock ratio name
275 507115 166457 3 907 1348 .005 kern/vfs_bio.c:357 (needsbuffer lock)
310 487209 166460 2 1158 645 .006 kern/vfs_bio.c:315 (needsbuffer lock)
1084 3860336 166507 23 1241 1377 .007 kern/vfs_bio.c:1445 (buf queue lock)
1667 35018604 320038 109 3877 0 .012 kern/uipc_usrreq.c:696 (unp_mtx)
379 2143505 635740 3 10736 37083 .016 kern/sys_socket.c:176 (so_snd)
1503 4311935 502656 8 8664 9312 .017 kern/kern_lock.c:163 (lockbuilder mtxpool)
875 3495175 166487 20 3394 4272 .020 kern/vfs_bio.c:2424 (vnode interlock)
2084 121390320 2880081 42 67339 79525 .023 kern/uipc_usrreq.c:581 (so_snd)
909 1809346 165769 10 4454 9597 .026 kern/vfs_vnops.c:796 (vnode interlock)
277 518716 166442 3 5034 5172 .030 kern/vfs_bio.c:1464 (vnode interlock)
1565 10515648 282278 37 15760 10821 .055 kern/subr_sleepqueue.c:374 (process lock)
492 2500241 634835 3 54003 62520 .085 kern/kern_sig.c:1002 (process lock)
569 335913 30022 11 3262 2176 .108 kern/kern_sx.c:245 (lockbuilder mtxpool)
1378 27840143 320038 86 42183 1453 .131 kern/uipc_usrreq.c:705 (so_rcv)
300 1011100 320045 3 52423 30742 .163 kern/uipc_socket.c:1101 (so_snd)
437 10472850 3200213 3 576918 615361 .180 kern/kern_resource.c:1172 (sleep mtxpool)
2052 46242974 320039 144 80690 80729 .252 kern/uipc_usrreq.c:617 (unp_global_mtx)
546 48160602 3683470 13 1488801 696814 .404 kern/kern_descrip.c:1988 (filedesc structure)
395 13842967 3683470 3 1568927 685295 .425 kern/kern_descrip.c:1967 (filedesc structure)
644 16700212 635731 26 606615 278511 .954 kern/kern_descrip.c:420 (filedesc structure)
384 2863741 635774 4 654035 280340 1.028 kern/kern_descrip.c:368 (filedesc structure)
604 22164433 2721994 8 5564709 2225496 2.044 kern/kern_synch.c:220 (process lock)
After:
max total count avg cnt_hold cnt_lock ratio name
168 467413 166364 2 1025 2655 .006 kern/vfs_bio.c:357 (needsbuffer lock)
264 453972 166364 2 1688 44 .010 kern/vfs_bio.c:315 (needsbuffer lock)
240 2011519 640106 3 12032 48460 .018 kern/sys_socket.c:176 (so_snd)
425 5394174 514469 10 12838 15343 .024 kern/kern_lock.c:163 (lockbuilder mtxpool)
514 5127131 166383 30 4417 5666 .026 kern/vfs_bio.c:1445 (buf queue lock)
261 199860 38442 5 1405 475 .036 kern/kern_sx.c:245 (lockbuilder mtxpool)
707 174604101 2880083 60 119723 84566 .041 kern/uipc_usrreq.c:581 (so_snd)
126 520485 166351 3 7850 8574 .047 kern/vfs_bio.c:1464 (vnode interlock)
364 1850567 165607 11 8077 22156 .048 kern/vfs_vnops.c:796 (vnode interlock)
499 3233479 166432 19 9258 8468 .055 kern/vfs_bio.c:2424 (vnode interlock)
754 42181810 320038 131 21236 0 .066 kern/uipc_usrreq.c:696 (unp_mtx)
462 21081419 3685605 5 316514 243585 .085 kern/kern_descrip.c:1988 (filedesc structure)
577 12178436 321182 37 28585 21082 .088 kern/subr_sleepqueue.c:374 (process lock)
221 2410704 640387 3 75056 77553 .117 kern/kern_sig.c:1002 (process lock)
309 12026860 3685605 3 468707 331121 .127 kern/kern_descrip.c:1967 (filedesc structure)
299 973885 320046 3 60629 72506 .189 kern/uipc_socket.c:1101 (so_snd)
471 6132557 640097 9 125478 98778 .196 kern/kern_descrip.c:420 (filedesc structure)
737 33114067 320038 103 85243 1 .266 kern/uipc_usrreq.c:705 (so_rcv)
454 5866777 878113 6 240669 364921 .274 kern/kern_synch.c:220 (process lock)
365 2308060 640133 3 183152 142569 .286 kern/kern_descrip.c:368 (filedesc structure)
220 10297249 3200211 3 1117448 1175412 .349 kern/kern_resource.c:1172 (sleep mtxpool)
947 57806295 320040 180 132456 109179 .413 kern/uipc_usrreq.c:617 (unp_global_mtx)
filedesc contention is down by a factor of 3-4, with a corresponding
reduction in the average hold time. The process lock contention coming
from the signal delivery wakeup has also gone way down, for some
reason. unp contention has risen a bit. The other big loss is sleep
mtxpool contention, which roughly doubled; it comes from the
UIDINFO_LOCK() in chgsbsize():
/*
 * Change the total socket buffer size a user has used.
 */
int
chgsbsize(uip, hiwat, to, max)
	struct uidinfo *uip;
	u_int *hiwat;
	u_int to;
	rlim_t max;
{
	rlim_t new;

	UIDINFO_LOCK(uip);
So the next question is: how can that be optimized?
Kris
More information about the freebsd-performance mailing list