Ethernet Drivers: Question on Sending Received Packets to the FreeBSD Network Stack
rwatson at FreeBSD.org
Sat Feb 5 05:54:40 UTC 2011
On Thu, 3 Feb 2011, Julian Elischer wrote:
> On 2/3/11 10:08 AM, David Somayajulu wrote:
>> Hi All, while sending received Ethernet frames (the non-LRO case) to the
>> FreeBSD network stack via (struct ifnet *)->if_input((struct ifnet *),
>> (struct mbuf *)), is it possible to send multiple Ethernet frames in a
>> single invocation of the above callback function?
>> In other words, should the (struct mbuf *) above always correspond to a
>> single Ethernet frame? I am not sure if I missed something, but I gathered
>> from a quick perusal of ether_input() in net/if_ethersubr.c that only ONE
>> Ethernet frame may be sent per callback.
> Yes, only one. The linkages you see in the mbuf definition are for when you
> are putting it into some queue (interface, socket, reassembly, etc.).
> I had never considered passing a set of packets, but after my initial
> scoffing thoughts I realized that it would actually be a very interesting
> thought experiment to see if the ability to do that would be advantageous in
> any way. It may be a way to reduce some sorts of overhead if using interrupt
This was discussed quite a lot at the network-related devsummit sessions a few
years ago. One idea that was bandied about was introducing an mbuf vector
data structure (I'm sure at least two floated around in Perforce at some
point, and another ended up built into the Chelsio driver, I think). The idea
was that indirection through queues of mbufs is quite inefficient when you
want to pass packets around in sets and the set may not be in the cache.
Instead, vectors of mbuf pointers would be passed around, each entry being a
chain representing a packet.
I think one reason that idea never really went anywhere was that the use cases
were fairly artificial for anything other than link-layer bridging between
exactly two interfaces, or systems with exactly one high-volume TCP
connection. In most scenarios, packets may come in small bursts going to the
same destination (etc.), as is exploited by LRO, but not in a way that lets
you keep passing them around in sets that remain viable as you get above the
link layer. I seem to recall benchmarking one of the prototypes and finding
that it increased the working set in memory noticeably, since it effectively
meant much more queueing was taking place, whereas our current direct-dispatch
model helped latency a great deal. It could be that with more deferred
dispatch in a parallel setting it would help, but you'd definitely want to
take a measurement-oriented approach in looking at it any further.
(For bridged ethernet filtering devices that act as a "bump in the wire", it
might well prove a reasonable performance optimisation. I'm not sure if the
cxgb mvec implementation would be appropriate to this task or not.)