ipv6 lock contention with parallel socket io

Tue Dec 31 07:35:38 UTC 2013

Hi,

I've noticed a lot of lock contention when running highly
parallel (> 4096 sockets) TCP IPv6 traffic.

The contention is here:

root@c001-freebsd11:/home/adrian/git/github/erikarn/libiapp # sysctl debug.lock.prof.stats | head -2 ; sysctl debug.lock.prof.stats | sort -nk4 | tail -10
debug.lock.prof.stats:
     max  wait_max       total  wait_total       count    avg wait_avg cnt_hold cnt_lock name
     507      1349      116285      233979      617482      0      0        0    11927 /usr/home/adrian/work/freebsd/head/src/sys/netinet/tcp_hostcache.c:291 (sleep mutex:tcp_hc_entry)
     499       995      122943      479346      617480      0      0        0   104210 /usr/home/adrian/work/freebsd/head/src/sys/netinet6/in6_src.c:885 (sleep mutex:rtentry)
     515       202      493751      581039      617481      0      0        0    12779 /usr/home/adrian/work/freebsd/head/src/sys/netinet6/in6.c:2376 (rw:lle)
    1872      2020     1542355     1529313      617481      2      2        0    97308 /usr/home/adrian/work/freebsd/head/src/sys/netinet6/nd6.c:2229 (rw:if_afdata)
     494      1066      141964     1892922      617481      0      3        0   503429 /usr/home/adrian/work/freebsd/head/src/sys/net/flowtable.c:1251 (sleep mutex:rtentry)
     388      1121      161966     2152377      617482      0      3        0   397770 /usr/home/adrian/work/freebsd/head/src/sys/netinet/tcp_output.c:1198 (sleep mutex:rtentry)
       7       849      603349     2431982      499778      1      4        0   279708 /usr/home/adrian/work/freebsd/head/src/sys/kern/subr_turnstile.c:551 (spin mutex:turnstile chain)
     539      1171      844350     5776354     1852441      0      3        0  1254017 /usr/home/adrian/work/freebsd/head/src/sys/net/route.c:380 (sleep mutex:rtentry)
     203      2849      851312     7862530      617481      1     12        0   139389 /usr/home/adrian/work/freebsd/head/src/sys/netinet6/nd6.c:1894 (rw:if_afdata)
      36      2401      363853    18179236      508578      0     35        0   125063 /usr/home/adrian/work/freebsd/head/src/sys/netinet6/ip6_input.c:701 (rw:if_afdata)
root@c001-freebsd11:/home/adrian/git/github/erikarn/libiapp #

.. the largest wait_total entries are rw:if_afdata, i.e. it's the
IF_AFDATA lock surrounding the lla_lookup() calls.

Now:

* Is there any reason this isn't an rmlock?
* The instance early on in nd6_output_lle() isn't taking the read
lock, it's taking the full (write) lock. Is there any reason for this?

I don't have much experience or time to spend on optimising the IPv6
code to scale better, but this seems like one of those things that'll
bite us in the ass as the amount of IPv6 deployed increases.

Does anyone have any ideas/suggestions on how we could improve things?

Thanks,

-a