reno cwnd growth while app limited...

Randall Stewart rrs at netflix.com
Thu Sep 12 19:41:24 UTC 2019


Richard

Yes, that is something one could do, i.e. NewCWV… we had patches for it at
one time, but they were largely untested.

Pacing of course changes all of this as well.. i.e. BBR, for example, does not worry about
this since it is pacing the packets, so you never get a burst. We will soon be committing
an updated RACK stack that also has this ability.
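
For illustration only (a toy model of my own, not the RACK or BBR code): a paced sender releases
segments on a clock derived from the pacing rate, so a large cwnd by itself never turns into a
line-rate burst.

#include <stdint.h>
#include <stdio.h>

/* Toy pacing calculation: spacing between segment transmissions, in
 * microseconds, for a given pacing rate.  A pacing stack arms a timer for
 * roughly this interval instead of blasting the whole cwnd at once. */
static uint64_t
pacing_interval_usec(uint32_t seg_len_bytes, uint64_t pacing_rate_bps)
{
    if (pacing_rate_bps == 0)
        return 0;               /* unpaced: send immediately */
    return ((uint64_t)seg_len_bytes * 8 * 1000000) / pacing_rate_bps;
}

int
main(void)
{
    /* Example: 1460-byte segments at 100 Mbit/s -> ~116 us apart. */
    printf("%llu usec between segments\n",
        (unsigned long long)pacing_interval_usec(1460, 100000000));
    return 0;
}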

R

> On Sep 12, 2019, at 2:49 PM, Scheffenegger, Richard <Richard.Scheffenegger at netapp.com> wrote:
> 
> Michael,
>  
> Thanks a lot for pointing out the uperf utility - it could be configured in exactly the way I needed to demonstrate this...
>  
> In the below graphs, I traced the evolution of cwnd for a flow across the loopback in a VM.
>  
> The application does 3060 writes of 10 kB each with a 10 ms pause between them (well below the 230 ms minimum TCP idle period); once that phase is over, it floods the session with 8 writes of 10 MB each.
>  
> Currently, the stack initially grows cwnd up to the limit set by the receiver's window (set to 1.2 MB) during the low-rate phase, where no loss occurs...
>  
> Thus the application can send out a massive burst of data in a single RTT (or at line rate) when it chooses to do so...
>  
>  
> Using the guidance given by NewCWV (RFC 7661) and growing cwnd only when flightsize is larger than half of cwnd, the congestion window stays in a more reasonable range during the application-limited phase, thus limiting the maximum burst size.
>  
> Growth of cwnd in SS or CA is otherwise normal, but the inverse case (an application transitioning from high throughput to low) is not addressed. I wonder whether a reduction could be achieved without the timer infrastructure described in RFC 7661 (e.g. reducing cwnd by 1 MSS when flightsize is < ½ cwnd while not in recovery)…
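> 
> A minimal userland sketch of that rule, assuming simplified connection state (the struct below is a
> stand-in of my own, not the kernel's tcpcb, and the 1-MSS decay is only an illustration of the idea):
> 
> #include <stdint.h>
> #include <stdio.h>
> 
> /* Simplified stand-in for per-connection state (not the real kernel structures). */
> struct conn {
>     uint32_t snd_cwnd;   /* congestion window, bytes */
>     uint32_t snd_wnd;    /* receiver-advertised window, bytes */
>     uint32_t flightsize; /* bytes currently in flight ("pipe") */
>     uint32_t maxseg;     /* MSS, bytes */
>     int      in_recovery;
> };
> 
> /* Grow cwnd only while the window is actually being used (RFC 7661 spirit):
>  * flightsize must exceed half of cwnd before cwnd may increase. */
> static int
> cwnd_growth_allowed(const struct conn *c)
> {
>     return (c->snd_cwnd < c->snd_wnd &&       /* not clamped by rwnd  */
>             c->flightsize > c->snd_cwnd / 2); /* window actually used */
> }
> 
> /* Timer-less decay while application limited: outside of recovery, and with
>  * flightsize below half of cwnd, shrink cwnd by one MSS per call. */
> static void
> cwnd_app_limited_decay(struct conn *c)
> {
>     if (!c->in_recovery && c->flightsize < c->snd_cwnd / 2 &&
>         c->snd_cwnd > 4 * c->maxseg)          /* keep a sane floor */
>         c->snd_cwnd -= c->maxseg;
> }
> 
> int
> main(void)
> {
>     struct conn c = { .snd_cwnd = 600000, .snd_wnd = 1200000,
>                       .flightsize = 30000, .maxseg = 1460, .in_recovery = 0 };
> 
>     printf("growth allowed: %d\n", cwnd_growth_allowed(&c)); /* 0: app limited */
>     cwnd_app_limited_decay(&c);
>     printf("cwnd after decay: %u\n", (unsigned)c.snd_cwnd);
>     return 0;
> }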
>  
> <image004.png>
> Unlimited ssthresh:
> <image005.png>
>  
>  
> <image009.png>
>  
>  
> Richard Scheffenegger
> Consulting Solution Architect
> NAS & Networking
>  
> NetApp
> +43 1 3676 811 3157 Direct Phone
> +43 664 8866 1857 Mobile Phone
> Richard.Scheffenegger at netapp.com
>  
> https://ts.la/richard49892
>  
>  
> -----Original Message-----
> From: Randall Stewart <rrs at netflix.com> 
> Sent: Mittwoch, 11. September 2019 14:18
> To: Scheffenegger, Richard <Richard.Scheffenegger at netapp.com>
> Cc: Lawrence Stewart <lstewart at netflix.com>; Michael Tuexen <tuexen at FreeBSD.org>; Jonathan Looney <jtl at netflix.com>; freebsd-transport at freebsd.org; Cui, Cheng <Cheng.Cui at netapp.com>; Tom Jones <thj at freebsd.org>; bz at freebsd.org; Eggert, Lars <lars at netapp.com>
> Subject: Re: reno cwnd growth while app limited...
>  
> Interesting graph :)
>  
>  
> I know that years ago I had a discussion along these lines (talking about burst limits) with
> Kacheong Poon and Mark Allman. IIRC Kacheong said that, at the time, Sun limited the cwnd to
> something like 4 MSS more than the flight size (I could have that mixed up though, and it might
> have been Mark proposing that.. it's been a while; Sun was still a company then :D).
>  
> On the other hand, I am not sure that such a tight limit takes into account all of the ACK artifacts that
> seem to be rampant in the internet now.. BBR took the approach of limiting its cwnd to 2x BDP (or at
> least what it thought was the BDP), which is more along the lines of your 0.5 if I am reading you right.
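> 
> To make the two burst limits above concrete, here is a toy comparison of my own (the 4*MSS headroom
> and the 2x BDP cap are the figures recalled above; bdp_bytes stands for whatever estimate a stack keeps):
> 
> #include <stdint.h>
> #include <stdio.h>
> 
> /* Sun-style clamp as recalled above: never let cwnd run more than
>  * 4 MSS ahead of what is currently in flight. */
> static uint32_t
> clamp_cwnd_to_flight(uint32_t cwnd, uint32_t flightsize, uint32_t maxseg)
> {
>     uint32_t limit = flightsize + 4 * maxseg;
>     return (cwnd < limit) ? cwnd : limit;
> }
> 
> /* BBR-style cap: cwnd at most twice the estimated bandwidth-delay product. */
> static uint32_t
> clamp_cwnd_to_bdp(uint32_t cwnd, uint32_t bdp_bytes)
> {
>     uint32_t limit = 2 * bdp_bytes;
>     return (cwnd < limit) ? cwnd : limit;
> }
> 
> int
> main(void)
> {
>     /* 1.2 MB cwnd, 30 kB in flight, 1460-byte MSS, 250 kB BDP estimate. */
>     printf("flight clamp: %u\n", (unsigned)clamp_cwnd_to_flight(1200000, 30000, 1460));
>     printf("bdp clamp:    %u\n", (unsigned)clamp_cwnd_to_bdp(1200000, 250000));
>     return 0;
> }
> 
> The flight-based clamp is the tighter of the two; the BDP cap leaves more headroom for ACK artifacts.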
>  
> It might be something worth looking into but I would want to contemplate it for a while :)
>  
> R
>  
> > On Sep 11, 2019, at 8:04 AM, Scheffenegger, Richard <Richard.Scheffenegger at netapp.com> wrote:
> > 
> > Hi,
> > 
> > I was just looking at some graph data from running two parallel DCTCP flows against a CUBIC receiver (some internal validation) with traditional ECN feedback.
> > 
> > <image002.jpg>
> > 
> > 
> > Now, in the beginning, a single flow cannot overutilize the link capacity and never runs into any loss/mark… but snd_cwnd grows unbounded (since DCTCP uses the newreno “cc_ack_received” mechanism).
> > 
> > However, newreno_ack_received is only supposed to grow snd_cwnd when CCF_CWND_LIMITED is set, which remains set as long as snd_cwnd < snd_wnd (the receiver-signaled receive window).
> > 
> > But is this still* the correct behavior?
> > 
> > Say the data flow rate is application limited (every n milliseconds, a few kB) and the receiver has signaled a large window – cwnd will grow until it matches the receiver's window. If the application then chooses to no longer restrict itself, it could burst out significantly more data than the queuing of the path can handle…
> > 
> > So, shouldn't there be a second condition for cwnd growth, e.g. that pipe (flightsize) is close to cwnd (a factor of 0.5 during slow start, and say 0.85 during congestion avoidance), to prevent sudden large bursts when a flow comes out of being application limited? The intention here would be to restrict the worst-case burst that could be sent out (which is dealt with differently in other stacks) so that it ideally still fits into the path's queues…
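> > 
> > A rough sketch of what such a second condition could look like on top of the existing receive-window
> > check (simplified state of my own, not the actual cc_newreno code; the 0.5/0.85 factors are the ones
> > suggested above):
> > 
> > #include <stdint.h>
> > #include <stdio.h>
> > 
> > struct conn {                /* simplified stand-in for per-connection state */
> >     uint32_t snd_cwnd;       /* congestion window, bytes */
> >     uint32_t snd_wnd;        /* receiver-advertised window, bytes */
> >     uint32_t snd_ssthresh;   /* slow start threshold, bytes */
> >     uint32_t flightsize;     /* bytes in flight ("pipe") */
> > };
> > 
> > /* Existing condition: cwnd may grow while it is still below the receiver's
> >  * window (roughly what CCF_CWND_LIMITED tracks today). */
> > static int
> > cwnd_limited(const struct conn *c)
> > {
> >     return (c->snd_cwnd < c->snd_wnd);
> > }
> > 
> > /* Proposed extra condition: flightsize must be close to cwnd before growing;
> >  * 50% of cwnd in slow start, 85% in congestion avoidance. */
> > static int
> > cwnd_growth_allowed(const struct conn *c)
> > {
> >     int in_slowstart = (c->snd_cwnd < c->snd_ssthresh);
> >     uint64_t threshold = in_slowstart ?
> >         (uint64_t)c->snd_cwnd / 2 :
> >         (uint64_t)c->snd_cwnd * 85 / 100;
> > 
> >     return (cwnd_limited(c) && c->flightsize >= threshold);
> > }
> > 
> > int
> > main(void)
> > {
> >     struct conn c = { .snd_cwnd = 600000, .snd_wnd = 1200000,
> >                       .snd_ssthresh = 65535, .flightsize = 30000 };
> > 
> >     printf("growth allowed: %d\n", cwnd_growth_allowed(&c)); /* 0: app limited */
> >     return 0;
> > }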
> > 
> > RFC 5681 is silent on application-limited flows, though (but one could think of an application limiting a flow as another form of congestion, during which cwnd shouldn't grow…).
> > 
> > In the example above, growing cwnd up to about 500 kB and then remaining there would be approximately the expected setting – based on the average of two competing flows hovering at around 200-250 kB…
> > 
> > *) I'm referring to the much higher likelihood nowadays that an application's own pacing and transfer volume violate TCP's original design assumption, namely that the sender has unlimited data to send, with the timing controlled entirely at TCP's discretion.
> > 
> > 
> > Richard Scheffenegger
> > Consulting Solution Architect
> > NAS & Networking
> > 
> > NetApp
> > +43 1 3676 811 3157 Direct Phone
> > +43 664 8866 1857 Mobile Phone
> > Richard.Scheffenegger at netapp.com
> > 
> > 
> > https://ts.la/richard49892
>  
> ------
> Randall Stewart
> rrs at netflix.com

------
Randall Stewart
rrs at netflix.com




