svn commit: r194672 - in head/sys: kern netinet sys
Andrew Gallatin
gallatin at cs.duke.edu
Tue Jun 23 13:43:49 UTC 2009
Andre Oppermann wrote:
> Add soreceive_stream(), an optimized version of soreceive() for
> stream (TCP) sockets.
<....>
>
> Testers, especially with 10GigE gear, are welcome.
Awesome! On my very weak, ancient, consumer-grade athlon64 test
machine (AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2050.16-MHz
K8-class CPU)) using mxge and LRO, I see a roughly 700Mb/s increase in
bandwidth, from 7.7Gb/s to 8.4Gb/s. For what it's worth, this finally
gives FreeBSD performance parity with Linux on this hardware for
10GbE single-stream receive.
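For anyone else who wants to test: as I read the commit, the new
soreceive_stream() path is off by default and gets switched on at boot
via a loader tunable (tunable name as given in the r194672 commit
message; treat it as an assumption if your tree differs):

```shell
# /boot/loader.conf -- enable the optimized stream soreceive path.
# Read-only at runtime, so it must be set before boot.
net.inet.tcp.soreceive_stream="1"
```

After a reboot, `sysctl net.inet.tcp.soreceive_stream` should report 1.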
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to venice-my (192.168.1.15) port 0 AF_INET

Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % C      us/KB   us/KB

before:
 65536  65536  65536    60.01      7709.14   13.30    79.60    0.283   1.692

after:
 65536  65536  65536    60.01      8403.86   14.66    81.63    0.286   1.592
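For reference, the header above is what netperf prints for its
TCP_SENDFILE test. I didn't paste the command line, so take the exact
flags below as a plausible reconstruction rather than the literal
invocation:

```shell
# Hypothetical reconstruction of the benchmark run: netperf's
# TCP_SENDFILE test against the receiver, 60-second duration
# (-l 60 matches the ~60.01s elapsed time in the table above).
# venice-my is the receiver's hostname from the netperf banner.
netperf -H venice-my -t TCP_SENDFILE -l 60
```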
This is consistent across runs. Lockstat output for 10 seconds in the
middle of a run is very interesting and shows a huge reduction in
lock contention.
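(The snapshots below are lockstat contention profiles. On FreeBSD with
the DTrace lockstat provider loaded, a 10-second capture can be taken
with something along these lines; I'm not showing my exact flags, so
this is a sketch:)

```shell
# Sketch of a 10-second lock-contention capture with lockstat(1).
# Requires root and the DTrace modules (e.g. kldload dtraceall).
# By default lockstat records contention events such as the adaptive
# mutex spin/block counts shown in the output below.
lockstat sleep 10
```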
Before:

Adaptive mutex spin: 369333 events in 10.017 seconds (36869 events/sec)

Count indv cuml rcnt     nsec Lock                Caller
-------------------------------------------------------------------------------
303685  82%  82% 0.00     1080 0xffffff000f2f98d0  recvit+0x21
 63847  17% 100% 0.00       25 0xffffff000f2f98d0  ip_input+0xad
  1788   0% 100% 0.00      172 0xffffff0001c57c08  intr_event_execute_handlers+0x100
     8   0% 100% 0.00      389 vm_page_queue_mtx   trap+0x4ce
     1   0% 100% 0.00       30 0xffffff8000251598  ithread_loop+0x8e
     1   0% 100% 0.00      720 0xffffff8000251598  uhub_read_port_status+0x2d
     1   0% 100% 0.00     1639 0xffffff000f477190  vm_fault+0x112
     1   0% 100% 0.00        1 0xffffff001fecce10  mxge_intr+0x425
     1   0% 100% 0.00     1332 0xffffff0001845600  clnt_reconnect_call+0x105
-------------------------------------------------------------------------------

Adaptive mutex block: 89 events in 10.017 seconds (9 events/sec)

Count indv cuml rcnt     nsec Lock                Caller
-------------------------------------------------------------------------------
    83  93%  93% 0.00    20908 0xffffff000f2f98d0  tcp_input+0xd96
     3   3%  97% 0.00    45234 0xffffff8000259f08  fork_exit+0x118
     3   3% 100% 0.00    44862 0xffffff8000251598  fork_exit+0x118
-------------------------------------------------------------------------------
After:

Adaptive mutex spin: 105102 events in 10.020 seconds (10490 events/sec)

Count indv cuml rcnt     nsec Lock                Caller
-------------------------------------------------------------------------------
 75886  72%  72% 0.00     2860 0xffffff0001fdde20  ip_input+0xad
 28418  27%  99% 0.00     1355 0xffffff0001fdde20  recvit+0x21
   779   1% 100% 0.00      171 0xffffff0001642808  intr_event_execute_handlers+0x100
     7   0% 100% 0.00      670 vm_page_queue_mtx   trap+0x4ce
     5   0% 100% 0.00       46 0xffffff001fecce10  mxge_intr+0x425
     1   0% 100% 0.00      105 vm_page_queue_mtx   trap_pfault+0x142
     1   0% 100% 0.00      568 0xffffff8000251598  usb_process+0xd8
     1   0% 100% 0.00      880 0xffffff8000251598  ithread_loop+0x8e
     1   0% 100% 0.00      233 0xffffff001a224578  vm_fault+0x112
     1   0% 100% 0.00       60 0xffffff001a1759b8  syscall+0x28f
     1   0% 100% 0.00      809 0xffffff0001846000  clnt_reconnect_call+0x105
     1   0% 100% 0.00     1139 0xffffff0001fdde20  kern_recvit+0x1d4
-------------------------------------------------------------------------------

Adaptive mutex block: 88 events in 10.020 seconds (9 events/sec)

Count indv cuml rcnt     nsec Lock                Caller
-------------------------------------------------------------------------------
    80  91%  91% 0.00    25891 0xffffff0001fdde20  tcp_input+0xd96
     3   3%  94% 0.00    45979 0xffffff8000259f08  fork_exit+0x118
     3   3%  98% 0.00    45886 0xffffff8000251598  fork_exit+0x118
     1   1%  99% 0.00    38254 0xffffff8000259f08  intr_event_execute_handlers+0x100
     1   1% 100% 0.00    79858 0xffffff001a1760f8  kern_wait+0x7ee
-------------------------------------------------------------------------------
Drew
More information about the svn-src-all mailing list