Network Receive Performance Working Group
George Neville-Neil
gnn at neville-neil.com
Sun Jun 9 21:04:47 UTC 2013
Howdy,
At the Network Receive Performance working group at BSDCan we covered a narrower set of topics
than we normally do, which resulted in a reasonably sized work list for improving
our systems in this area. The main issues relate to getting a good API that addresses multi-queue
NICs. The notes are on the Wiki page and are reproduced here.
Best,
George
https://wiki.freebsd.org/201305DevSummit/NetworkReceivePerformance
The discussion opened with an attempt to constrain the problem we were trying to solve, including pointing out that any KPI/API suggested needed to be achievable in the next six months.
Some of the existing solutions to the problem of talking to hardware with multiple queues, which all high end NICs currently have, were:
• Connection Groups
• Not really a KPI
• RSS vs. Flow Table is an issue to solve, we have things for the former, but little for the latter
• Socket affinity is also an issue
• NAPI
• This is an API in Linux. It uses upcalls.
• Flow table mapping. Chelsio may have some of this.
• SR-IOV
• VLL Cloner
There are several ways to map flows, including: 4-tuple, MAC filter, and arbitrary offset. An API that only handles offset, length, value is too simple from the standpoint of getting the right data into the hardware. We need something richer on the kernel side of the API so that driver writers don't have to figure out our intentions.
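To make the contrast concrete, here is a minimal sketch of what a richer classification rule might look like, where the rule names the kind of match rather than handing the driver a bare offset/length/value triple. All type and field names here are illustrative assumptions, not an existing FreeBSD KPI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow-classification rule: the match type tells the
 * driver the caller's intent, so it can pick the right hardware
 * filter instead of guessing from raw offsets. */
enum flow_match_type {
	FLOW_MATCH_4TUPLE,	/* src/dst address and port */
	FLOW_MATCH_MAC,		/* destination MAC filter */
	FLOW_MATCH_RAW		/* offset/length/value fallback */
};

struct flow_rule {
	enum flow_match_type fr_type;
	union {
		struct {
			uint32_t src_ip, dst_ip;
			uint16_t src_port, dst_port;
			uint8_t  proto;
		} tuple;
		uint8_t mac[6];
		struct {
			uint16_t offset, length;
			uint8_t  value[16];
		} raw;
	} fr_match;
	uint16_t fr_queue;	/* hardware queue to steer matches to */
};
```

A driver seeing `FLOW_MATCH_4TUPLE` could program a perfect-match filter, while `FLOW_MATCH_RAW` remains available for hardware that only supports offset-based matching.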
Some methods that a good KPI/API ought to have include:
• Query Device for information about its queues, including how many exist, and how they are mapped to other resources, including CPU and memory
• Map CPUID to a Flow
• Setup RSS
• Request RxRing local memory
• Solaris Mapping API might be a way to go (http://www.oracle.com/technetwork/articles/servers-storage-admin/crossbowsetup-191326.pdf)
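As one illustration of the first item above, a device-query operation might return a structure like the following. This is a user-space sketch with invented names (`rxq_info`, `example_get_rxq_info`); a real KPI would presumably hang such a method off the driver or ifnet.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical answer to "query device for information about its
 * queues": how many exist and how they map to CPUs. */
struct rxq_info {
	uint16_t nqueues;	/* number of receive queues */
	uint16_t hash_width;	/* bits of RSS hash the device uses */
	int	 cpu[64];	/* CPU each queue's interrupt is bound to */
};

/* Stand-in for a driver method; values are made up for the sketch. */
static int
example_get_rxq_info(struct rxq_info *ri)
{
	memset(ri, 0, sizeof(*ri));
	ri->nqueues = 8;
	ri->hash_width = 7;	/* e.g. 2^7 = 128 RSS buckets */
	for (int i = 0; i < ri->nqueues; i++)
		ri->cpu[i] = i;	/* one queue per CPU in this sketch */
	return (0);
}
```

With such a call, the stack (or a tool like netstat) could discover the queue-to-CPU mapping instead of each consumer re-deriving it.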
Some consumers of such an API include: Performance, affinity, virtualization, policy, kernel bypass, QoS, and VIMAGE.
We have two patches, covering different pieces, to start from: Vijay's [RobertWatson] and Randall's [RandallStewart], [GeorgeNevilleNeil]
We need quite a few things, including:
• Per connection flow table
• Describing queues in the stack such that we can expose interesting parts via netstat.
• Packet Batching. This was not overwhelmingly popular.
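For the second item, describing queues in the stack so netstat can expose them, a per-queue statistics record might look something like this. The structure and counter names are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-queue statistics of the sort netstat could print
 * if receive queues were first-class objects in the stack. */
struct rxq_stats {
	uint64_t rq_packets;	/* packets delivered from this queue */
	uint64_t rq_drops;	/* packets dropped for lack of buffers */
	uint64_t rq_qlen;	/* instantaneous queue length */
	int	 rq_cpu;	/* CPU currently servicing this queue */
};
```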
A straw person API includes:
• MBUF Flag
• Hash Value
• The whole thing may be used as opaque
• Used by the stack for inpcb
• Get number of buckets
• Map bucket to RSS
• Map queue/ithread to CPU
• Get width of the hash
• RSS get CPU
• RSS get hash algo
• Pick hash inputs
• Get and set key
• Rebalance
• Software hash table
• Query queue length
• Get queue affinity
• Set mask (CPUSET) on socket
• Set policy on CPU/socket
• Queue event reporting
• Load distribution stats
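A few of the straw-person calls above (get number of buckets, get width of the hash, map bucket to CPU) can be sketched in a couple of lines. Function names and the bucket count are assumptions, not the KPI that was eventually committed.

```c
#include <assert.h>
#include <stdint.h>

#define RSS_HASH_WIDTH	7		/* hash width: 2^7 = 128 buckets */

/* "Get number of buckets" from the straw-person list. */
static inline int
rss_get_num_buckets(void)
{
	return (1 << RSS_HASH_WIDTH);
}

/* "Map bucket to CPU": mask the RSS hash down to a bucket, then
 * spread buckets across CPUs round-robin in this sketch. */
static inline int
rss_bucket_to_cpu(uint32_t hash, int ncpus)
{
	return ((int)(hash & (rss_get_num_buckets() - 1)) % ncpus);
}
```

An inpcb could cache the masked hash value (the "used by the stack for inpcb" item) and use this mapping to keep a connection's processing on one CPU.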