[PATCH] Add a new TCP_IGNOREIDLE socket option

Lawrence Stewart lstewart at freebsd.org
Wed Feb 13 14:01:23 UTC 2013


On 02/10/13 16:05, Kevin Oberman wrote:
> On Sat, Feb 9, 2013 at 6:41 AM, Alfred Perlstein <bright at mu.org> wrote:
>> On 2/7/13 12:04 PM, George Neville-Neil wrote:
>>>
>>> On Feb 6, 2013, at 12:28 , Alfred Perlstein <bright at mu.org> wrote:
>>>
>>>> On 2/6/13 4:46 AM, John Baldwin wrote:
>>>>>
>>>>> On Wednesday, February 06, 2013 6:27:04 am Randall Stewart wrote:
>>>>>>
>>>>>> John:
>>>>>>
>>>>>> A burst at line rate will *often* cause drops. This is because
>>>>>> router queues are of finite size. Also, such a burst (especially
>>>>>> on a long-delay, high-bandwidth network) causes your RTT to increase
>>>>>> even if there is no drop, which is going to hurt you as well.
>>>>>>
>>>>>> A SHOULD in an RFC says you really, really, really, really need to
>>>>>> do it unless there is something that makes you willing to override
>>>>>> it. It leaves only slight wiggle room.
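For anyone following the thread without the RFC at hand, the SHOULD being
discussed here is RFC 5681's slow-start restart after an idle period. A
rough sketch of that check, using made-up field names rather than our
actual tcpcb, including the per-socket opt-out the proposed option would
add:

#include <stdint.h>

/* Illustrative connection state only; not FreeBSD's tcpcb. */
struct conn {
	uint32_t cwnd;		/* congestion window, bytes */
	uint32_t maxseg;	/* MSS, bytes */
	uint32_t rto;		/* retransmission timeout, ticks */
	uint32_t last_send;	/* time of last transmission, ticks */
	int	 ignore_idle;	/* the proposed per-socket opt-out */
};

static void
restart_after_idle(struct conn *c, uint32_t now)
{
	uint32_t restart_win = 4 * c->maxseg;	/* roughly the initial window */

	if (c->ignore_idle)
		return;		/* proposed behaviour: keep cwnd across idle periods */
	if (now - c->last_send > c->rto && c->cwnd > restart_win)
		c->cwnd = restart_win;
}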
>>>>>>
>>>>>> In this I agree with Andre, we should not be *not* doing it. Otherwise
>>>>>> folks will be turning this on and it is plain wrong. It may be fine
>>>>>> for your network but I would not want to see it in FreeBSD.
>>>>>>
>>>>>> In my testing here at home I have put max-burst back into our
>>>>>> stack. This uses Mark Allman's version (not Kacheong Poon's), where
>>>>>> you clamp the cwnd at no more than 4 packets larger than your
>>>>>> flight. All of my testing, high-bw-delay or LAN, has shown this to
>>>>>> improve TCP performance, because it helps you avoid bursting out so
>>>>>> many packets that you overflow a queue.
>>>>>>
>>>>>> On your long-delay, high-bandwidth link, if you do burst out too
>>>>>> many (and you never know how many that is, since you cannot predict
>>>>>> how full all those MPLS queues are or how big they are) you will
>>>>>> hurt yourself even worse. Note that in Cisco routers the default
>>>>>> queue size is generally somewhere between 100 and 300 packets,
>>>>>> depending on the router.
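The max-burst clamp described above amounts to roughly the following
(again, names are illustrative, not the actual implementation):

#include <stdint.h>

/*
 * Allman-style max-burst: never let the usable window run more than
 * four segments ahead of the data already in flight, so a single ACK
 * (or an idle restart) cannot trigger a large line-rate burst.
 */
static uint32_t
clamped_cwnd(uint32_t cwnd, uint32_t flight_size, uint32_t maxseg)
{
	uint32_t limit = flight_size + 4 * maxseg;

	return (cwnd < limit ? cwnd : limit);
}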
>>>>>
>>>>> Due to the way our application works this never happens, but I am
>>>>> fine with just keeping this patch private.  If there are other shops
>>>>> that need this they can always dig the patch up from the archives.
>>>>>
>>>> This is yet another time when I'm sad about how things happen in
>>>> FreeBSD.
>>>>
>>>> A developer comes forward with a non-default option that's very
>>>> useful for some specific workloads, from a shop that contributes much
>>>> time and $$$ to the project, and the community rejects the patches
>>>> even though the feature has been successful in other OSes.
>>>>
>>>> It makes zero sense.
>>>>
>>>> John, can you repost the patch?  Maybe there is a way to refactor this
>>>> somehow so it's like accept filters where we can plug in a hook for TCP?
>>>>
>>>> I am very disappointed, but not surprised.
>>>>
>>> I take away the complete opposite feeling.  This is how we work
>>> through these issues.  It's clear from the discussion that this need
>>> not be a default in the system, and is a special case.  We had a
>>> reasoned discussion of what would be best to do, and at least two
>>> experts in TCP weighed in on the effect this change might have.
>>>
>>> Not everything proposed by a developer needs to go into the tree; in
>>> particular, since these discussions are archived, we can always
>>> revisit this later.
>>>
>>> This is exactly how collaborative development should look, whether or
>>> not the patch is integrated now, next week, next year, or ever.
>>
>>
>> I agree that discussion is great; we have all learned quite a bit from
>> it about TCP and the dangers of adjusting buffering without
>> considerable thought.  I would not be involved in FreeBSD had this type
>> of discussion and information not been shared on the lists so readily.
>>
>> However, the end result must be far different than what has occurred so far.
>>
>> If the code was deemed unacceptable for general inclusion, then we must
>> find a way to provide a lightweight framework that accomplishes the
>> needs of the community member.
>>
>> Take for instance someone who is starting a company that needs this
>> facility.  Which OS will they choose?  One that has integrated a useful
>> feature, or one that has rejected it and left that code in the mailing
>> list archives?
>>
>> As valuable as expert opinion is, it must be paired with an
>> understanding of the special cases our users and developers need
>> handled and the ability to accommodate them.
> 
> This is a subject rather near to my heart, having fought battles with
> congestion back in the dark days of Windows when it essentially
> defaulted to TCP_IGNOREIDLE. It was a huge pain, but it was the only
> way Windows did TCP in the early days. It simply did not implement
> slow-start. This was really evil, but in the days when lots of links
> were 56K and T-1 was mostly used for network core links, the Internet,
> small as it was back then, did not melt, though it glowed a
> frightening shade of red fairly often. Today, too many systems running
> like this would melt things very quickly.
> 
> OTOH, I can certainly see cases, like John's, where it would be very
> beneficial. And, yes, Linux has it. (I don't see this as relevant in
> any way except as proof that not enough people have turned it on to
> cause serious problems... yet!) It seems a shame to make everyone who
> really has a need develop their own patches or dig through old mail to
> find John's.
> 
> What I would like to see is a way to have it available, but make it
> unlikely to be enabled except in a way that would put up flashing red
> warnings and sound sirens to warn people that it is very dangerous and
> can be a way to blow off a few of one's own toes.
> 
> One idea that popped into my head (and it may be completely ridiculous)
> is to make its availability dependent on a kernel option and have a
> warning in NOTES that it contravenes normal and accepted practice and
> can cause serious problems, both for yourself and for others using the
> network.

Agreed. A sysctl as suggested by Grenville might be sufficient though.
Requiring a full kernel recompile seems a bit draconian.
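To be concrete about what I have in mind (the name and placement below
are made up for illustration, not taken from John's patch), something
along these lines would gate the behaviour behind a net.inet.tcp knob:

#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/sysctl.h>

SYSCTL_DECL(_net_inet_tcp);

/*
 * Hypothetical gate: the TCP_IGNOREIDLE setsockopt would be refused
 * unless an administrator has explicitly flipped this on.
 */
static int tcp_allow_ignoreidle = 0;
SYSCTL_INT(_net_inet_tcp, OID_AUTO, allow_ignoreidle, CTLFLAG_RW,
    &tcp_allow_ignoreidle, 0,
    "Allow TCP_IGNOREIDLE; skipping slow-start restart can hurt shared links");

That keeps the default safe, leaves an obvious place to hang a warning in
the documentation, and avoids asking people to rebuild a kernel.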

> I might also note that almost all higher-performance (1G and faster)
> networks already have a form of this... TSO. In case you hadn't
> noticed, TSO will take a large buffer and send it as multiple segments
> transmitted back to back with NO delay or awareness of congestion. I
> can confirm that even this limited case can and does sometimes result
> in packet loss when router queues are inadequate to handle the load.

You nailed it - took the words right off my fingertips. Sure, a flow's
cwnd can exceed the TSO max chunk size by an order of magnitude, but the
fact remains that we live in a bursty world already. As much as I
dislike TSO in its current incarnation, it exists for good reason. We
need to provide useful tools and thorough documentation, and set
sensible defaults.
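For completeness, this is roughly how an application would opt in if
something like John's option were merged. TCP_IGNOREIDLE is not defined
in the stock <netinet/tcp.h>, so the value below is purely a placeholder:

#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>

#ifndef TCP_IGNOREIDLE
#define	TCP_IGNOREIDLE	0x1000	/* placeholder, not an assigned constant */
#endif

/* Ask the stack to keep cwnd across idle periods on this socket. */
int
enable_ignore_idle(int fd)
{
	int one = 1;

	if (setsockopt(fd, IPPROTO_TCP, TCP_IGNOREIDLE, &one,
	    sizeof(one)) == -1) {
		perror("setsockopt(TCP_IGNOREIDLE)");
		return (-1);
	}
	return (0);
}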

Cheers,
Lawrence

