Does FreeBSD have sendmmsg or recvmmsg system calls?

Sepherosa Ziehau sepherosa at gmail.com
Wed Jan 13 12:25:31 UTC 2016


On Tue, Jan 12, 2016 at 10:53 PM, Boris Astardzhiev
<boris.astardzhiev at gmail.com> wrote:
> Hello again,
>
> In my spare time I did the following simple libc-only implementation of the
> syscalls.
> I did some tests in a VM adapting these experiments:
> https://blog.cloudflare.com/how-to-receive-a-million-packets/


On Dragonfly, I could easily do 1.3Mtrans/s (one trans == receiving an
18B UDP datagram and sending it back) w/o the {recv,send}mmsg() API on
a 4C/8T Ivy Bridge i7.  I think only SO_REUSEPORT and the CPU hint
(Dfly has the SO_CPUHINT getsockopt) matter in their test.

Thanks,
sephe


>
> Any comments about the diff are greatly appreciated.
>
> Best regards,
> Boris Astardzhiev
>
> On Fri, Jan 8, 2016 at 7:02 PM, Adrian Chadd <adrian.chadd at gmail.com> wrote:
>
>> On 8 January 2016 at 03:02, Bruce Evans <brde at optusnet.com.au> wrote:
>> > On Fri, 8 Jan 2016, Adrian Chadd wrote:
>> >
>> >> On 7 January 2016 at 23:58, Mark Delany <c2h at romeo.emu.st> wrote:
>> >>>
>> >>> On 08Jan16, Bruce Evans allegedly wrote:
>> >>>>
>> >>>> If the NIC can't reach line rate
>> >>>
>> >>>
>> >>>> Network stack overheads are also enormous.
>> >>>
>> >>>
>> >>> Bruce makes some excellent points.
>> >>>
>> >>> I challenge anyone to get line rate UDP out of FBSD (or Linux) for a
>> >>> 1G NIC let alone a 10G NIC listening to a single port. It was exactly
>> >>> my frustration with UDP performance that led me down the path of
>> >>> *mmsg() and netmap.
>> >>>
>> >>> Frankly this is an opportunity for FBSD as UDP performance appears to
>> >>> be a neglected area.
>> >>
>> >>
>> >> I'm there, on 16 threads.
>> >>
>> >> I'd rather we do it on two or three, as a lot of time is wasted in
>> >> producer/consumer locking. But yeah, 500k tx/rx should be doable per
>> >> CPU with only locking changes.
>>
>> .. and I did mean "kernel producer/consumer locking changes."
>>
>> >
>> > Line rate for 1 Gbps is about 1500 kpps (small packets).
>> >
>> > With I218V2 (em), I see enormous lock contention above 3 or 4 (user)
>> > threads, and 8 are slightly slower than 1.  1 doesn't saturate the NIC,
>> > and 2 is optimal.
>> >
>>
>> The RSS support in -HEAD lets you get away with parallelising UDP
>> streams very nicely.
>>
>> The framework is pretty simple (!):
>>
>> * drivers ask the RSS code for the RSS config and RSS hash to use, and
>> configure the hardware appropriately;
>> * the netisr input paths check the existence of the RSS hash and will
>> calculate it in software if required;
>> * v4/v6 reassembly is done (at the IP level, /not/ at the protocol
>> level) and if it needs a new RSS hash / netisr reinjection, that'll
>> happen;
>> * the PCB lookup code for listen sockets now allows one listen socket
>> per RSS bucket - as the RSS / PCBGROUPS code already extended the PCB
>> to have one PCB table per RSS bucket (as well as a global one);
>>
>> So:
>>
>> * userland code queries RSS for the CPU and RSS bucket setup;
>> * you then create one listen socket per RSS bucket, bind it to the
>> local thread (if you want) and tell it "you're in RSS bucket X";
>> * .. and then in the UDP case for local-bound sockets, the
>> transmit/receive path does not require modifying the global PCB state,
>> so the locking is kept per-RSS bucket, and scales linearly with the
>> number of CPUs you have (until you hit the NIC queue limits.)
>>
>> https://github.com/erikarn/freebsd-rss/
>>
>> and:
>>
>>
>> http://adrianchadd.blogspot.com/2014/06/hacking-on-receive-side-scaling-rss-on.html
>>
>> http://adrianchadd.blogspot.com/2014/07/application-awareness-of-receive-side.html
>>
>> http://adrianchadd.blogspot.com/2014/08/receive-side-scaling-figuring-out-how.html
>>
>> http://adrianchadd.blogspot.com/2014/09/receive-side-scaling-testing-udp.html
>>
>> http://adrianchadd.blogspot.com/2014/10/more-rss-udp-tests-this-time-on-dell.html
>>
>>
>>
>> -adrian
>> _______________________________________________
>> freebsd-net at freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"
>>
>



-- 
Tomorrow Will Never Die

