[PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

Oleg Moskalenko mom040267 at gmail.com
Mon Dec 2 04:29:25 UTC 2013


Sepherosa, while reading your description I noticed another long-standing
problem for UDP application developers: UDP sockets are always hashed by
2-tuple. But UDP sockets can also be "connected" to a remote address with
the connect(...) function. Unfortunately, with 2-tuple hashing that pattern
is useless for large-scale applications: if a large number of UDP sockets
on the same local port are "connected" to remote addresses, then the kernel
has to walk a long list of UDP sockets with the same hash value.
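
As a minimal sketch of the pattern (assumptions: Python instead of C for
brevity; a platform where SO_REUSEPORT lets several UDP sockets bind the
same local address and port, such as Linux 3.9+ or DragonFly; the helper
name open_connected_udp and the loopback addresses are made up for
illustration), several UDP sockets can share one local port while each is
connect()ed to a different peer:

```python
import socket


def open_connected_udp(local_ip, remotes):
    """Open one UDP socket per remote, all bound to the same local port.

    Hypothetical helper for illustration. Assumes the platform allows
    several UDP sockets to bind the same address/port with SO_REUSEPORT.
    """
    socks = []
    port = 0  # let the kernel pick a port for the first socket
    for remote in remotes:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # Both options must be set before bind() to share the port.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind((local_ip, port))
        port = s.getsockname()[1]  # reuse the same port for the rest
        # "Connect" the UDP socket: no packets are sent, the kernel just
        # records the remote address for this socket.
        s.connect(remote)
        socks.append(s)
    return socks


if __name__ == "__main__":
    peers = [("127.0.0.1", 40001), ("127.0.0.1", 40002)]
    for s in open_connected_udp("127.0.0.1", peers):
        print(s.getsockname(), "->", s.getpeername())
        s.close()
```

Every socket created this way has the same local address and port, so a
table keyed only on the local side cannot tell them apart without scanning.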

If connected UDP sockets were hashed by 4-tuple instead, it would be very
helpful for the new generation of UDP-based media applications. For
example, servers that use the DTLS protocol would become simpler and more
efficient.
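
To make the difference concrete, here is a toy model in Python (not kernel
code). It assumes that "2-tuple" means (local address, local port) and that
"4-tuple" adds (remote address, remote port); the class UdpLookupTable and
the addresses are hypothetical, and a dict keyed on the tuple stands in for
a real hash table, so bucket collisions between different keys are ignored.

```python
class UdpLookupTable:
    """Toy model of a UDP socket lookup table (illustration only)."""

    def __init__(self, use_4tuple):
        self.use_4tuple = use_4tuple
        self.buckets = {}  # hash key -> list of sockets in that bucket

    def _key(self, sock):
        laddr, lport, raddr, rport = sock
        if self.use_4tuple:
            return (laddr, lport, raddr, rport)
        return (laddr, lport)  # assumed meaning of "2-tuple"

    def add_connected(self, sock):
        self.buckets.setdefault(self._key(sock), []).append(sock)

    def lookup(self, sock):
        """Return (found, number of bucket entries examined)."""
        examined = 0
        for candidate in self.buckets.get(self._key(sock), []):
            examined += 1
            if candidate == sock:
                return True, examined
        return False, examined


def worst_case_scan(use_4tuple, n):
    """Connect n sockets on one local port, then look up the last one."""
    table = UdpLookupTable(use_4tuple)
    socks = [("10.0.0.1", 3478, "192.0.2.1", 10000 + i) for i in range(n)]
    for sock in socks:
        table.add_connected(sock)
    found, examined = table.lookup(socks[-1])
    assert found
    return examined


if __name__ == "__main__":
    print("2-tuple entries examined:", worst_case_scan(False, 10000))
    print("4-tuple entries examined:", worst_case_scan(True, 10000))
```

In this model the 2-tuple table examines all 10000 entries to find the last
socket, while the 4-tuple table examines one.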

Thanks
Oleg



On Sun, Dec 1, 2013 at 8:17 PM, Sepherosa Ziehau <sepherosa at gmail.com> wrote:

>
>
>
> On Sat, Nov 30, 2013 at 2:42 AM, Ermal Luçi <eri at freebsd.org> wrote:
>
>> Well seems Dragonfly has some version of it already from commit [1].
>>
>>
> The distribution algorithm was changed a little after the initial commit
> to gain more idle time (bnx(4) output has already been maxed out):
>
> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/c275f18d832361be28b150d3f4fd518914bdeba6
>
> Well, I also addressed a reasonable concern from the nginx folks (I am not
> quite sure about Linux's position on it; Linux's original implementation
> of SO_REUSEPORT from Google had this drawback, which I mentioned in the
> commit message):
>
> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/02ad2f0b874fb0a45eb69750219f79f5e8982272
>
> As for nginx, the SO_REUSEPORT patch for nginx (both 1.4.x and 1.5.x) is
> in dports; it should be easy to backport to FreeBSD's ports.  I failed to
> convince the nginx folks to merge it into mainline, and I am currently
> busy with other things; I will come back to it later.  If FreeBSD is going
> to implement Linux-style SO_REUSEPORT, pushing the patch to the nginx
> mainline will be easier.
>
> I also put up a brief description of SO_REUSEPORT in dfly; it may be
> useful to you:
> http://leaf.dragonflybsd.org/~sephe/netisr_so_reuseport.txt
>
> Best Regards,
> sephe
>
>
>> In FreeBSD there is a framework for this, enabled by defining PCBGROUP.
>> It is explained at [2] and [3].
>> It can achieve approximately the same features as Linux's SO_REUSEPORT.
>> The only things missing are the marketing behind it and, I think, better
>> RSS support.
>> Judging by the dates, the support was there before Linux, so all you guys
>> looking for it can experiment with it.
>>
>> What I was trying to accomplish was something other than a performance
>> improvement, and maybe a sysctl could be put behind it to make it more
>> acceptable.
>>
>> [1] http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/740d1d9f7b7bf9c9c021abb8197718d7a2d441c9
>> [2] http://fxr.watson.org/fxr/source/netinet/in_pcbgroup.c?im=bigexcerpts#L51
>> [3] http://lists.freebsd.org/pipermail/svn-src-head/2011-June/028190.html
>>
>>
>> On Fri, Nov 29, 2013 at 7:03 PM, Oleg Moskalenko <mom040267 at gmail.com>
>> wrote:
>>
>> > Tim, you are wrong. Read the definition of "multicast", and read how
>> > UDP and TCP sockets work in Linux 3.9+ kernels.
>> >
>> > Oleg .
>> >
>> >
>> > On Fri, Nov 29, 2013 at 9:59 AM, Tim Kientzle <kientzle at freebsd.org>
>> > wrote:
>> >
>> >>
>> >> On Nov 29, 2013, at 4:04 AM, Ermal Luçi <eri at freebsd.org> wrote:
>> >>
>> >> > Hello,
>> >> >
>> >> > since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two
>> >> > daemons to share the same port and possibly listening ip …
>> >>
>> >> These flags are used with TCP-based servers.
>> >>
>> >> I’ve used them to make software upgrades go more smoothly.
>> >> Without them, the following often happens:
>> >>
>> >> * Old server stops.  In the process, all of its TCP connections
>> >>   are closed.
>> >>
>> >> * Connections to old server remain in the TCP connection table
>> >>   until the remote end can acknowledge.
>> >>
>> >> * New server starts.
>> >>
>> >> * New server tries to open port but fails because that port is
>> >>   “still in use” by connections in the TCP connection table.
>> >>
>> >> With these flags, the new server can open the port even though
>> >> it is “still in use” by existing connections.
>> >>
>> >>
>> >> > This is not the case today.
>> >> > Only multicast sockets seem to have the behaviour of broadcasting
>> >> > the data to all sockets sharing the same properties through these
>> >> > options!
>> >>
>> >> That is what multicast is for.
>> >>
>> >> If you want the same data sent to all listeners, then
>> >> that is multicast behavior and you should be using
>> >> a multicast socket.
>> >>
>> >> > The patch at [1] implements/corrects the behaviour for UDP sockets.
>> >>
>> >> You’re trying to turn all UDP sockets with those options
>> >> into multicast sockets.
>> >>
>> >> If you want a multicast socket, you should ask for one.
>> >>
>> >> Tim
>> >>
>> >> _______________________________________________
>> >> freebsd-net at freebsd.org mailing list
>> >> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> >> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"
>> >>
>> >
>> >
>>
>>
>> --
>> Ermal
>>
>
>
>
> --
> Tomorrow Will Never Die
>

