TSO and FreeBSD vs Linux

Julian Elischer julian at freebsd.org
Wed Aug 14 16:05:12 UTC 2013


On 8/14/13 2:33 PM, Julian Elischer wrote:
> On 8/14/13 11:39 AM, Lawrence Stewart wrote: 

>> There's a thing controlled by ethtool called GRO (generic receive
>> offload) which appears to be enabled by default on at least Ubuntu
>> and, I guess, other Linux distributions too. It's responsible for
>> aggregating ACKs and data to batch them up the stack if the driver
>> doesn't provide a hardware offload implementation. Try rerunning your
>> experiments with the ACK batching disabled on the Linux host to get an
>> additional comparison point.
> I will try that as soon as I get back to the machines in question.

Turning GRO on and off seems to make no difference, either to the
overall throughput or at the packet-by-packet level (according to
tcptrace).
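
For anyone repeating the comparison, a minimal sketch of toggling GRO on
the Linux host might look like the following. The interface name "eth0"
and the helper names are assumptions for illustration only; the
underlying command is `ethtool -K <iface> gro on|off`, which needs root.

```python
import subprocess

def gro_command(iface, enable):
    """Build the ethtool argv that switches GRO on or off for iface."""
    return ["ethtool", "-K", iface, "gro", "on" if enable else "off"]

def set_gro(iface, enable):
    """Apply the setting; requires root on the Linux host."""
    subprocess.run(gro_command(iface, enable), check=True)

# "eth0" is a placeholder interface name, not taken from the test setup.
# Only the command line is printed here; call set_gro() to apply it.
print(" ".join(gro_command("eth0", False)))
```

Running the transfer once after set_gro(iface, False) and once after
set_gro(iface, True) gives the two comparison points.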

>>> for two examples look at:
>>>
>>>
>>> http://www.freebsd.org/~julian/LvsF-tcp-start.tiff
>>> and
>>> http://www.freebsd.org/~julian/LvsF-tcp.tiff
>>>
>>> in each case, we can see FreeBSD on the left and Linux on the right.
>>>
>>> The first case shows the sessions as they start, and the second case
>>> shows some distance later (when the sequence numbers wrap around; no
>>> particular reason to use that, it was just fun to see).
>>> In both cases you can see that each Linux packet (white), once they
>>> have got going, is responding to multiple bumps in the send window
>>> sequence number (green and yellow lines, representing the arrival of
>>> several ACKs), while FreeBSD produces a whole bunch of smaller packets,
>>> slavishly following exactly the size of each incoming ACK. This gives
>>> us quite a performance debt.
>> Again, please s/performance/what-you-really-mean/ here.
> OK, in my tests this makes FreeBSD data transfers much slower, by as
> much as 60%.
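
To make the difference seen in the traces concrete, here is a toy model
of the two sending patterns. It is not taken from either stack. The MSS
of 1448 bytes, the assumption that each ACK opens two segments of window,
and the batch size of four ACKs are all illustrative.

```python
MSS = 1448  # assumed segment payload size in bytes

def sends_per_ack(ack_sizes):
    """One transmit per incoming ACK, sized to what that ACK opened."""
    return list(ack_sizes)

def sends_batched(ack_sizes, acks_per_send):
    """One transmit per group of ACKs, sized to the sum the group opened."""
    return [sum(ack_sizes[i:i + acks_per_send])
            for i in range(0, len(ack_sizes), acks_per_send)]

acks = [2 * MSS] * 8              # eight ACKs, each opening two segments
per_ack = sends_per_ack(acks)     # the pattern seen on the FreeBSD side
batched = sends_batched(acks, 4)  # the pattern seen on the Linux side

print(len(per_ack), "sends of", per_ack[0], "bytes")
print(len(batched), "sends of", batched[0], "bytes")
```

The same number of bytes goes out either way; the batched sender simply
hands the NIC fewer, larger TSO requests.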
>>
>>> Notice that this behaviour in Linux seems to be modal: it seems to
>>> 'switch on' a little bit into the 'starting' trace.
>>>
>>> In addition, you can also see that Linux gets going faster even in
>>> the beginning, where TSO isn't in play, by sending a lot more packets
>>> up-front (of course the wisdom of this can be argued).
>> They switched to using an initial window of 10 segments some time ago.
>> FreeBSD starts with 3 or, more recently, 10 if you're running recent
>> 9-STABLE or 10-CURRENT.
> I tried setting initial values as shown:
>   net.inet.tcp.local_slowstart_flightsize: 10
>   net.inet.tcp.slowstart_flightsize: 10
> it didn't seem to make too much difference but I will redo the test.
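
As a rough guide to how much the initial window alone should matter, the
sketch below counts round trips in an idealised slow start: the window
doubles every RTT, with no loss, no delayed ACKs and no receiver window
limit. The function name and the 100-segment transfer size are
assumptions for illustration.

```python
def rtts_to_send(total_segments, initial_window):
    """Round trips needed to send total_segments in idealised slow start."""
    cwnd = initial_window
    sent = 0
    rtts = 0
    while sent < total_segments:
        sent += cwnd   # send a full window this round trip
        cwnd *= 2      # slow start: window doubles once per RTT
        rtts += 1
    return rtts

for iw in (3, 10):
    print("IW", iw, "->", rtts_to_send(100, iw), "RTTs for 100 segments")
```

Under these assumptions the difference is only a couple of round trips at
the very start of a connection, so on a long bulk transfer it would be
expected to be small.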
>
>>
>>> Has anyone done any work on aggregating ACKs, or delaying responding
>>> to them?
>> As noted by Navdeep, we already have the code to aggregate ACKs in our
>> software LRO implementation. The bigger problem is that appropriate
>> byte counting (ABC) places a default 2*MSS limit on the amount of ACKed
>> data the window can grow by, i.e. if an ACK for 64k of data comes up
>> the stack, we'll grow the window by 2 segments' worth of data in
>> response. That needs to be addressed: we could send the ACK count up
>> with the aggregated single ACK, or just ignore abc_l_var when LRO is in
>> use for a connection.
> So, does "Software LRO" mean that LRO on the NIC should be ON or OFF
> to see this?
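
The 2*MSS limit described above can be illustrated with a simplified
model of slow-start window growth under appropriate byte counting. This
is a sketch, not FreeBSD's code: grow_cwnd, the MSS of 1448 bytes, the
starting window and the one-ACK-per-segment pattern are assumptions, and
abc_l stands in for the abc_l_var limit mentioned in the quoted text.

```python
def grow_cwnd(cwnd, bytes_acked, mss, abc_l=2):
    """Slow-start growth with ABC: credit at most abc_l * mss per ACK."""
    return cwnd + min(bytes_acked, abc_l * mss)

MSS = 1448         # assumed segment payload size in bytes
SEGMENTS = 45      # roughly 64k of data at this MSS
start = 10 * MSS   # assumed window before the ACKs arrive

# Case 1: every segment is ACKed separately.
cwnd_separate = start
for _ in range(SEGMENTS):
    cwnd_separate = grow_cwnd(cwnd_separate, MSS, MSS)

# Case 2: the same data is covered by one aggregated ACK.
cwnd_aggregated = grow_cwnd(start, SEGMENTS * MSS, MSS)

print("growth, separate ACKs:  ", cwnd_separate - start, "bytes")
print("growth, aggregated ACK: ", cwnd_aggregated - start, "bytes")
```

In this model the separate ACKs grow the window by all 45 segments, while
the single aggregated ACK is capped at 2 segments, which is the gap that
passing up the ACK count or ignoring the limit under LRO would close.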
>
>
>>
>> Cheers,
>> Lawrence
>>
>>
>
> _______________________________________________
> freebsd-net at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"
>


