Adding Flow Director sysctls to ixgbe(4)

Takuya ASADA syuu at dokukino.com
Fri Sep 23 13:59:38 UTC 2011


Hi,

On Sep 9, 2011, at 10:56 AM, owner-freebsd-net at freebsd.org wrote:

> On Fri, Sep 09, 2011 at 01:44:34AM +0100, Ben Hutchings wrote:
>> On Thu, 2011-09-08 at 20:13 -0400, George Neville-Neil wrote:
>>> On Sep 8, 2011, at 14:49 , Navdeep Parhar wrote:
>>> 
>>>> On Thu, Sep 08, 2011 at 08:34:11AM -0400, John Baldwin wrote:
>>>>> On Monday, September 05, 2011 7:21:12 am Ben Hutchings wrote:
>>>>>> On Mon, 2011-09-05 at 15:51 +0900, Takuya ASADA wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I implemented Ethernet Flow Director sysctls for ixgbe(4). Here are the details:
>>>>>>> 
>>>>>>> - Adding and removing signature filters
>>>>>>> The Linux version of the ixgbe driver can set and remove perfect
>>>>>>> filters from userland using the ethtool command.
>>>>>>> I implemented a similar feature, but via sysctl, and with signature
>>>>>>> filters rather than perfect filters (which means hash collisions may occur).
>>>>>> [...]
>>>>>> 
>>>>>> Linux also has a generic interface to RX filtering and hashing
>>>>>> (ethtool_rxnfc) which ixgbe supports; wouldn't it be better for FreeBSD
>>>>>> to support something like that?
>>>>> 
>>>>> Some sort of shared interface might be nice.  The cxgb(4) and cxgbe(4) drivers
>>>>> both provide their own tools to manipulate filters, though they do not
>>>>> provide explicit steering IIRC.
>>>> 
>>>> Both of them can filter as well as steer (and the tools let you do that).
>>>> cxgbe(4) can do a lot more (rewrite + switch, replicate, etc.) but those
>>>> features are perhaps too specialized to be configurable via a general
>>>> purpose tool.
>>>> 
>>>>> 
>>>>> We would need to come up with some sort of standard interface (ioctls?) for 
>>>>> adding filters however.
>>>> 
>>>> +1 for a standard interface.
>>>> 
>>>> imho the kernel needs to be aware of the rx and tx queues of a NIC, and
>>>> not just for steering.  But that's a separate discussion.
>>>> 
>>> 
>>> Well, I do think this is actually all of a piece.  Most of us realize by now that
>>> high speed (e.g. 10G and higher) NICs only make sense if you can steer traffic and
>>> pin queues to cores etc.
>> 
>> Well, you can get way better than 1G performance without that.  And for
>> routers, flow hashing may be fine.  But for a host, of course, steering
>> packets properly can provide a major performance win.
>> 
>> [...]
>>> What this means is that we have
>>> a failure of abstraction.  Abstraction has a cost, and some of the people who want
>>> access to low level queues are not interested in paying an extra abstraction cost.
>> 
>> Abstraction has a cost, but it's not necessarily that high compared to
>> rewriting a whole chunk of sockets code (especially if you don't
>> actually have the source code).
>> 
>>> I think that some of the abstractions we need are tied up in the work that Takuya did
>>> for SoC and some of it is in the work done by Luigi on netmap.  I'd go so far as to say
>>> that what we should do is try to combine those two pieces of code into a set of
>>> low level APIs for programs to interact with high speed NICs.  The one thing most
>>> people do not talk about is extending our socket API to do two things that I think would
>>> be a win for 80% of our users.  If a socket, and also a kqueue, could be pinned
>>> to a CPU as well as a NIC queue that should improve overall bandwidth for a large
>>> number of our users.  The API there is definitely an ioctl() and the hard part is
>>> doing the tying together.  To do this we need to also work out our low level story.
>> 
>> But it would be a lot nicer if this could be done automatically.  Which
>> I believe it can - see the RFS and XPS features in Linux.
> 
> rwatson@ has been working on "connection groups" (not sure what he calls
> his project) with a goal to improve the placement of work in the FreeBSD
> network stack.  Some of the code is in the kernel but the parts that
> require closer cooperation with a NIC are not.

It looks like that work is about reducing lock contention on inpcb lookup. Does it also affect the other parts? (e.g. CPU affinity
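
To make the signature-versus-perfect filter distinction from the start of this thread concrete, here is a minimal sketch in Python. It is not the ixgbe(4) or hardware implementation. The hash (CRC32 truncated to 8 bits), the table classes, and the helper names are all assumptions chosen for illustration, and the signature is kept deliberately tiny so that a collision is easy to find.

```python
import zlib

# Deliberately tiny signature width so collisions are easy to find.
# Illustrative only; real hardware uses its own hash and signature size.
SIG_BITS = 8


def signature(flow):
    """Hash a (src_ip, dst_ip, src_port, dst_port, proto) tuple to SIG_BITS bits."""
    data = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(data) & ((1 << SIG_BITS) - 1)


class SignatureFilterTable:
    """Stores only the flow's hash, so distinct flows may share an entry."""

    def __init__(self):
        self.entries = {}  # signature -> RX queue

    def add(self, flow, queue):
        self.entries[signature(flow)] = queue

    def lookup(self, flow):
        return self.entries.get(signature(flow))


class PerfectFilterTable:
    """Stores the full flow tuple, so only an exact match hits."""

    def __init__(self):
        self.entries = {}  # full tuple -> RX queue

    def add(self, flow, queue):
        self.entries[flow] = queue

    def lookup(self, flow):
        return self.entries.get(flow)


def find_colliding_flow(flow):
    """Brute-force a different source port whose signature equals flow's."""
    src_ip, dst_ip, _, dst_port, proto = flow
    target = signature(flow)
    for port in range(1024, 65536):
        candidate = (src_ip, dst_ip, port, dst_port, proto)
        if candidate != flow and signature(candidate) == target:
            return candidate
    return None


# Hypothetical flow steered to RX queue 3 in both tables.
flow = ("10.0.0.1", "10.0.0.2", 40000, 80, "tcp")
sig_table = SignatureFilterTable()
perfect_table = PerfectFilterTable()
sig_table.add(flow, 3)
perfect_table.add(flow, 3)

# A second, unrelated flow that happens to share the same signature.
other = find_colliding_flow(flow)
```

In this sketch, `sig_table.lookup(other)` steers the unrelated flow to queue 3 as well, while `perfect_table.lookup(other)` returns `None`. That is the trade-off: a signature table needs less state per entry, at the cost of occasional mis-steering.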

