Support for zero copy sockets
Navdeep Parhar
np at FreeBSD.org
Mon Aug 11 19:11:54 UTC 2014
There is zero copy receive (aka Direct Data Placement -- DDP) in the TOE
driver that accompanies cxgbe(4). I have a tx zero copy implementation
for it as well (this is not in -current right now). But all this code
is chip specific and applies only to TCP connections that are handled
by the TOE driver. It doesn't rely on COW or page flipping.
The reason I'm mentioning all of this here is that if anyone is thinking
of working on proper zero copy awareness (and APIs) at the socket layer
then count me in as an interested party.
Regards,
Navdeep
On 08/11/14 11:34, Alan Cox wrote:
> The send path used an ad hoc copy-on-write mechanism, i.e., it was not the
> mechanism used by fork, etc. This mechanism was broken (and as I'll argue
> in a few sentences not worth fixing). The receive path used page flipping
> and required support from the NIC. Neither copy-on-write nor page flipping
> is a viable approach on today's multicore machines because their
> implementation entails interprocessor TLB shootdowns. Let them rest in
> peace.
>
> On Mon, Aug 11, 2014 at 1:04 PM, Adrian Chadd <adrian at freebsd.org> wrote:
>
>> On 11 August 2014 01:26, Victor Balada Diaz <victor at bsdes.net> wrote:
>>> On Mon, Aug 04, 2014 at 10:00:16AM -0700, Sushanth Rai via freebsd-hackers wrote:
>>>> Hello,
>>>>
>>>> The FreeBSD 10 release sources don't seem to have the zero copy socket
>>>> code anymore. What is the alternative for doing zero copy?
>>>>
>>>> Thanks,
>>>> Sushanth
>>>
>>> You need to use sendfile(2). The man page states that the implementation
>>> in FreeBSD is zero copy.
>>>
>>> You can also check:
>>>
>>> http://svnweb.freebsd.org/base?view=revision&revision=255608
>>>
>>
>> I'd like to reintroduce a zero copy socket IO method for at least
>> write that doesn't rely on sendfile.
>>
>> The zero-copy socket page flipping thing was interesting because IIRC
>> it tried to work for both sending and receiving socket data. Doing that
>> via an API would be nicer.
>>
>> So, if people have an idea for how it could be done / what the API
>> looks like then I'm all ears.
>>
>> -a
>> _______________________________________________
>> freebsd-hackers at freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe at freebsd.org"
>>
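
For reference, the sendfile(2) route suggested in the quoted thread could
look roughly like the sketch below. It is only an illustration under stated
assumptions: the helper name send_whole_file is hypothetical, the socket is
assumed to be a blocking, connected stream socket, and the non-FreeBSD branch
is a plain copying fallback included only so the sketch builds on other
systems; that branch is not zero copy.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): send an entire open file,
 * starting at offset 0, over a blocking connected stream socket.
 * Returns 0 on success, -1 on error with errno set.
 */
int
send_whole_file(int file_fd, int sock_fd)
{
	off_t offset = 0;

#ifdef __FreeBSD__
	for (;;) {
		off_t sent = 0;

		/* nbytes == 0 asks sendfile(2) to send until end of file. */
		if (sendfile(file_fd, sock_fd, offset, 0, NULL, &sent, 0) == 0)
			return (0);
		/* An interrupted call may still report progress in 'sent'. */
		if (errno != EINTR)
			return (-1);
		offset += sent;
	}
#else
	/* Portability fallback: an ordinary copying loop, NOT zero copy. */
	char buf[4096];
	ssize_t n;

	while ((n = pread(file_fd, buf, sizeof(buf), offset)) != 0) {
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		offset += n;
		for (ssize_t done = 0; done < n; ) {
			ssize_t w = write(sock_fd, buf + done, (size_t)(n - done));

			if (w < 0) {
				if (errno == EINTR)
					continue;
				return (-1);
			}
			done += w;
		}
	}
	return (0);
#endif
}
```

A caller would open the file, connect or accept the socket, and then call
send_whole_file(file_fd, sock_fd). A non-blocking socket would additionally
need to wait for writability on EAGAIN, which this sketch leaves out.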