route/arp lifetime (Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux))

Alexander V. Chernikov melifaro at ipfw.ru
Wed Aug 14 12:17:18 UTC 2013


On 14.08.2013 16:05, Luigi Rizzo wrote:
> On Wed, Aug 14, 2013 at 03:47:13PM +0400, Lev Serebryakov wrote:
>> Hello, Luigi.
>> You wrote on 14 August 2013, 14:21:09:
>>
>> LR> Then the problem remains that we should keep a copy of route and
>> LR> arp information in the socket instead of redoing the lookups on
>> LR> every single transmission, as they consume some 25% of the time of
>> LR> a sendto(), and probably even more when it comes to large tcp
>> LR> segments, sendfile() and the like.
>>    And we should invalidate this info on ARP/route changes, or the
>>   connection will be lost in such cases, am I right? So on each such
>>   event the code would have to look through all sockets and check whether
>>   their routing/ARP information is still valid. Or we would have to store
>>   lists of sockets in the routing and ARP tables... I don't know which
>>   is worse.
> I think we should start by acknowledging that routing and ARP
> information is inherently stale, and changes infrequently.
> So it is not a disaster if we have incorrect information for some
> short amount of time (milliseconds) because in the end the remote
> party that decides to change it and inform us may take much longer
> than that to distribute the update.
You can save the rte and ARP entries; however, doing so gives you a
perfect chance to crash your kernel if the egress interface is
destroyed (e.g. a vlan, ng or tun interface).
>
>
> Considering that each lookup takes between 100..300ns if you are
> lucky (not many misses, relatively empty table etc.), one could
> reasonably do the lookup at most once per millisecond or so (just
> reading 'ticks', no need for a nanotime() if you have a slow clock),
> or whenever we get an error related to the socket, either in the
> forward path (e.g. ifp points to an interface that is down) or in
> the reverse path (e.g. a dupack because we sent a packet to the
> wrong place).
This sounds like "Hey, the kernel lookup is slow (which is true), so let's
make a hack and not bother with lookups".
This approach gives us mtx-locked rte refcounts, which are used (misused)
in many places, making things worse and reducing our ability to fix
things properly.
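
To make the proposal quoted above concrete, here is a minimal userland
sketch of the "redo the lookup at most once per tick, or after an error"
idea. Everything in it is hypothetical: struct cached_route, slow_lookup()
and the plain 'ticks' variable only stand in for the real rte/ARP
machinery, and the destination is assumed to be fixed per socket. It is
not kernel code, and it does nothing about the interface-lifetime problem
mentioned above (it stores an interface index rather than a pointer only
to stay self-contained).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel 'ticks' counter (about 1 ms at hz=1000). */
static volatile int ticks;

/* Hypothetical result of a combined route + ARP lookup. */
struct nexthop {
    int     ifindex;    /* egress interface index (assumption: not an ifp pointer) */
    uint8_t lladdr[6];  /* link-layer address of the next hop */
};

/* Per-socket cache, as suggested in the quoted text. */
struct cached_route {
    struct nexthop nh;
    int  stamp;   /* value of 'ticks' when nh was last refreshed */
    bool valid;   /* false until the first lookup */
    bool error;   /* set by the caller on a tx error or a dupack */
};

/* Counts how often the expensive path is taken (for illustration only). */
static unsigned long lookups;

/* Placeholder for the real 100..300ns route + ARP lookup. */
static struct nexthop slow_lookup(uint32_t dst)
{
    struct nexthop nh = { .ifindex = (int)(dst & 0xff), .lladdr = {0} };

    lookups++;
    return nh;
}

/* Return the cached next hop, refreshing it at most once per tick or after an error. */
static const struct nexthop *cached_lookup(struct cached_route *cr, uint32_t dst)
{
    assert(cr != NULL);
    if (!cr->valid || cr->error || cr->stamp != ticks) {
        cr->nh = slow_lookup(dst);
        cr->stamp = ticks;
        cr->valid = true;
        cr->error = false;
    }
    return &cr->nh;
}
```

With this, a burst of sends inside one tick costs a single slow_lookup(),
and the caller forces a refresh by setting cr->error when the forward or
reverse path reports a problem.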
>
> cheers
> luigi


