[PATCH] Add a new TCP_IGNOREIDLE socket option

Lawrence Stewart lstewart at freebsd.org
Wed Feb 13 08:32:43 UTC 2013


FYI I've read the whole thread as of this reply and plan to follow up to
a few of the other posts separately, but first for my initial thoughts...

On 01/23/13 07:11, John Baldwin wrote:
> As I mentioned in an earlier thread, I recently had to debug an issue we were 
> seeing across a link with a high bandwidth-delay product (both high bandwidth 
> and high RTT).  Our specific use case was to use a TCP connection to reliably 
> forward a latency-sensitive datagram stream across a WAN connection.  We would 
> often see spikes in the latency of individual datagrams.  I eventually tracked 
> this down to the connection entering slow start when it would transmit data 
> after being idle.  The data stream was quite bursty and would often attempt to 
> transmit a burst of data after being idle for far longer than a retransmit 
> timeout.

Got it.
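
For anyone following along, here is a minimal sketch of the standard
behaviour being described: RFC 5681 section 4.1 says that once a
connection has been idle for longer than the RTO, cwnd should be cut to
the restart window RW = min(IW, cwnd) before sending resumes. This is
not the FreeBSD code; the function names and the millisecond units are
assumptions for illustration, and IW follows the RFC 3390 formula.

```c
#include <stdint.h>

/* RFC 3390 initial window in bytes: min(4*MSS, max(2*MSS, 4380)). */
static uint32_t
initial_window(uint32_t mss)
{
	uint32_t upper = 4 * mss;
	uint32_t lower = (2 * mss > 4380) ? 2 * mss : 4380;

	return (upper < lower) ? upper : lower;
}

/*
 * Hypothetical helper: the cwnd to use when transmission resumes.
 * If the idle period exceeds the RTO, clamp cwnd to the restart
 * window RW = min(IW, cwnd); otherwise leave it untouched.
 */
static uint32_t
cwnd_after_idle(uint32_t cwnd, uint32_t mss, uint32_t idle_ms,
    uint32_t rto_ms)
{
	uint32_t rw;

	if (idle_ms <= rto_ms)
		return cwnd;		/* not idle long enough, keep cwnd */

	rw = initial_window(mss);
	return (cwnd < rw) ? cwnd : rw;	/* slow start restarts from RW */
}
```

With a 1460 byte MSS, a connection that had grown cwnd to 100000 bytes
drops back to 4380 bytes after such an idle period, which is where the
latency spike on the next burst comes from.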

> In 7.x we had worked around this in the past by disabling RFC 3390 and jacking 
> the slow start window size up via a sysctl.  On 8.x this no longer worked.

I can't think of, nor have I read, any convincing argument for why we
shouldn't support your use case out of the box. You're not the only user
of FreeBSD over dedicated lines who knows what you're doing. We should
provide some way to support this use case.

We're therefore left with the question of how to implement this.

As noted in the "Some questions about the new TCP congestion control
code" thread [1], it was always my intention to axe the ss_flightsize
variables and replace them with a better mechanism. Andre swung the axe
before I did and 10.x is looming so it's a good time to discuss all of this.

> The solution I came up with was to add a new socket option to disable idle 
> handling completely.  That is, when an idle connection restarts with this new 
> option enabled, it keeps its current congestion window and doesn't enter slow 
> start.

rwatson@ mentioned an idea in private discussion which I've also thought
about over the years. The real goal here should be to subsume your use
case (and others) into a much richer framework for hinting desired
behaviour/tradeoff preferences (some aspects of which relate to parts of
my PhD work, which will hopefully be coming to a kernel near you in 2013 ;).

My main concern with your patch is that I'm a bit uneasy about
enshrining a socket option in a public API and documentation that is so
specific. I suspect apps probably want to set higher level goals like
"low latency *at any cost*" and have the stack opaquely interpret that
as "this guy is willing to blow his foot off, so let's disable idle
window reset, tweak X, disable Y and hand the man his loaded shotgun".
TCP_IGNOREIDLE as currently proposed misses this bigger picture, though
it doesn't preclude it either.

I would also echo Kevin/Grenville's thoughts about keying the socket
option's activation off a tunable (sysctl or kernel option is up for
discussion, though I'd be leaning towards sysctl) that is disabled by
default, i.e. only skip the after-idle window reset if the app sets the
option *and* the sysadmin has pulled the "I like me some bursty network"
lever.
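
To make the gating concrete, here is a rough sketch of the check I have
in mind. None of these names exist in the tree: allow_ignore_idle stands
in for the system-wide tunable and the ignore_idle field for the
per-socket flag that TCP_IGNOREIDLE would set, so treat both as
assumptions.

```c
#include <stdbool.h>

/* Hypothetical per-connection state; not a real kernel structure. */
struct conn_hints {
	bool ignore_idle;	/* app asked for it via the socket option */
};

/* Hypothetical system-wide tunable (e.g. a sysctl), off by default. */
static bool allow_ignore_idle = false;

/*
 * Skip the after-idle window reset only when both the sysadmin and
 * the application have opted in.
 */
static bool
skip_idle_reset(const struct conn_hints *c)
{
	return allow_ignore_idle && c->ignore_idle;
}
```

With this shape an app setting the option on a default system gets the
standard behaviour, and flipping the tunable alone changes nothing for
sockets that never asked.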

> There are only a few cases where such an option is useful, but if anyone else 
> thinks this might be useful I'd be happy to add the option to FreeBSD.

The idea is useful. I'd just like to discuss the implementation
specifics a little further before recommending whether the patch should
go in as is to provide a stopgap, or we rework the patch to be a little
less specific in readiness for the future work I have in mind.

Cheers,
Lawrence

[1] http://lists.freebsd.org/pipermail/freebsd-net/2013-January/034297.html

