svn commit: r333744 - in head/sys: kern sys
Matthew Macy
mmacy at freebsd.org
Thu Jun 7 05:02:01 UTC 2018
Try r334756 and then send me your scripts for any panics you can produce.
-M
On Wed, Jun 6, 2018 at 7:12 PM, Matthew Macy <mmacy at freebsd.org> wrote:
> On Wed, Jun 6, 2018 at 12:03 PM, Peter Holm <peter at holm.cc> wrote:
>> On Thu, May 17, 2018 at 05:59:35PM +0000, Matt Macy wrote:
>>> Author: mmacy
>>> Date: Thu May 17 17:59:35 2018
>>> New Revision: 333744
>>> URL: https://svnweb.freebsd.org/changeset/base/333744
>>>
>>> Log:
>>> AF_UNIX: make unix socket locking finer grained
>>>
>>> This change moves to using a reference count across lock drop / reacquire
>>> to guarantee liveness.
>>>
>>> Currently sends on unix sockets contend heavily on read locking the list lock.
>>> unix1_processes in will-it-scale peaks at 6 processes and then declines.
>>>
>>> With this change I get a substantial improvement in number of operations per second
>>> with 96 processes:
>>>
>>> x before
>>> + after
>>> N Min Max Median Avg Stddev
>>> x 11 1688420 1696389 1693578 1692766.3 2971.1702
>>> + 10 63417955 71030114 70662504 69576423 2374684.6
>>> Difference at 95.0% confidence
>>> 6.78837e+07 +/- 1.49463e+06
>>> 4010.22% +/- 88.4246%
>>> (Student's t, pooled s = 1.63437e+06)
>>>
>>> And even for 2 processes shows a ~18% improvement.
>>> "Small" iron changes (1, 2, and 4 processes):
>>>
>>> x before1
>>> + after1.2
>>> +------------------------------------------------------------------------+
>>> | + |
>>> | x + |
>>> | x + |
>>> | x + |
>>> | x ++ |
>>> | xx ++ |
>>> |x x xx ++ |
>>> | |__________________A_____M_____AM____||
>>> +------------------------------------------------------------------------+
>>> N Min Max Median Avg Stddev
>>> x 10 1131648 1197750 1197138.5 1190369.3 20651.839
>>> + 10 1203840 1205056 1204919 1204827.9 353.27404
>>> Difference at 95.0% confidence
>>> 14458.6 +/- 13723
>>> 1.21463% +/- 1.16683%
>>> (Student's t, pooled s = 14605.2)
>>>
>>> x before2
>>> + after2.2
>>> +------------------------------------------------------------------------+
>>> | +|
>>> | +|
>>> | +|
>>> | +|
>>> | +|
>>> | +|
>>> | x +|
>>> | x +|
>>> | x xx +|
>>> |x xxxx +|
>>> | |___AM_| A|
>>> +------------------------------------------------------------------------+
>>> N Min Max Median Avg Stddev
>>> x 10 1972843 2045866 2038186.5 2030443.8 21367.694
>>> + 10 2400853 2402196 2401043.5 2401172.7 385.40024
>>> Difference at 95.0% confidence
>>> 370729 +/- 14198.9
>>> 18.2585% +/- 0.826943%
>>> (Student's t, pooled s = 15111.7)
>>>
>>> x before4
>>> + after4.2
>>> N Min Max Median Avg Stddev
>>> x 10 3986994 3991728 3990137.5 3989985.2 1300.0164
>>> + 10 4799990 4806664 4806116.5 4805194 1990.6625
>>> Difference at 95.0% confidence
>>> 815209 +/- 1579.64
>>> 20.4314% +/- 0.0421713%
>>> (Student's t, pooled s = 1681.19)
>>>
>>> Tested by: pho
>>> Reported by: mjg
>>> Approved by: sbruno
>>> Sponsored by: Limelight Networks
>>> Differential Revision: https://reviews.freebsd.org/D15430
>>>
>>> Modified:
>>> head/sys/kern/uipc_usrreq.c
>>> head/sys/sys/unpcb.h
>>>
>>> Modified: head/sys/kern/uipc_usrreq.c
>>> ==============================================================================
>>> --- head/sys/kern/uipc_usrreq.c Thu May 17 17:57:41 2018 (r333743)
>>
>> This commit seems to cause this:
>>
>> 20180606 20:44:37 all (1/1): datagram.sh
>> panic: _mtx_lock_sleep: recursed on non-recursive mutex unp_mtx @ ../../../kern/uipc_usrreq.c:354
>>
>> cpuid = 91
>> time = 1528310678
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00f3a2bcd0
>> vpanic() at vpanic+0x1a3/frame 0xfffffe00f3a2bd30
>> doadump() at doadump/frame 0xfffffe00f3a2bdb0
>> __mtx_lock_sleep() at __mtx_lock_sleep+0x49d/frame 0xfffffe00f3a2be30
>> __mtx_lock_flags() at __mtx_lock_flags+0xf9/frame 0xfffffe00f3a2be80
>> unp_connectat() at unp_connectat+0x2ec/frame 0xfffffe00f3a2c110
>> soconnectat() at soconnectat+0xe8/frame 0xfffffe00f3a2c160
>> kern_connectat() at kern_connectat+0x10f/frame 0xfffffe00f3a2c1c0
>> sys_connect() at sys_connect+0x77/frame 0xfffffe00f3a2c200
>> amd64_syscall() at amd64_syscall+0x2a7/frame 0xfffffe00f3a2c330
>> fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00f3a2c330
>> --- syscall (98, FreeBSD ELF64, sys_connect), rip = 0x80040081a, rsp = 0x7fffffffe2e8, rbp = 0x7fffffffe770 ---
>> KDB: enter: panic
>> [ thread pid 1276 tid 100966 ]
>> Stopped at kdb_enter+0x3b: movq $0,kdb_why
>> db> x/s version
>> version: FreeBSD 12.0-CURRENT #0 r333744: Wed Jun 6 20:35:18 CEST 2018\012 pho at flix1a.netperf.freebsd.org:/usr/src/sys/amd64/compile/PHO\012
>> db>
>
>
> I fixed this in 334175 and 334185.
>
>>
>> The stack trace is different on HEAD:
>>
>> panic: mutex unp not owned at ../../../kern/uipc_usrreq.c:879
>> cpuid = 47
>> time = 1528311629
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00ee593ca0
>> vpanic() at vpanic+0x1a3/frame 0xfffffe00ee593d00
>> panic() at panic+0x43/frame 0xfffffe00ee593d60
>> __mtx_assert() at __mtx_assert+0xb4/frame 0xfffffe00ee593d70
>> uipc_disconnect() at uipc_disconnect+0x93/frame 0xfffffe00ee593dc0
>> soclose() at soclose+0xa5/frame 0xfffffe00ee593e30
>> closef() at closef+0x1f5/frame 0xfffffe00ee593ec0
>> fdescfree_fds() at fdescfree_fds+0x90/frame 0xfffffe00ee593f10
>> fdescfree() at fdescfree+0x480/frame 0xfffffe00ee593fd0
>> exit1() at exit1+0x500/frame 0xfffffe00ee594040
>> sigexit() at sigexit+0xae8/frame 0xfffffe00ee594220
>> postsig() at postsig+0x1cf/frame 0xfffffe00ee5942f0
>> ast() at ast+0x4b8/frame 0xfffffe00ee594330
>> doreti_ast() at doreti_ast+0x1f/frame 0x7fffffffe760
>> KDB: enter: panic
>> [ thread pid 1013 tid 100834 ]
>> Stopped at kdb_enter+0x3b: movq $0,kdb_why
>> db> x/s version
>> version: FreeBSD 12.0-CURRENT #0 r334723: Wed Jun 6 20:53:32 CEST 2018\012 pho at flix1a.netperf.freebsd.org:/usr/src/sys/amd64/compile/PHO\012
>> db>
>
> I'll take a look.
> -M
More information about the svn-src-all
mailing list