Removing T/TCP and replacing it with something simpler

Andre Oppermann andre at freebsd.org
Fri Oct 22 08:14:09 PDT 2004


Sean Chittenden wrote:
> 
> >>> However something like T/TCP is certainly useful and I know of one
> >>> special purpose application using it (Web Proxy Server/Client for
> >>> high-delay Satellite connections).
> >>
> >> Actually, there are two/three programs that I know of that use it.
> >> memcached(1), which found a fantastic decrease in its benchmarks.
> >> Here's an excerpt from the following link:
> >>
> >> http://lists.danga.com/pipermail/memcached/2003-August/000111.html
> >
> > I think you got something wrong here.  T/TCP is never ever mentioned
> > in this.  Memcached is not using T/TCP as far as I can see.
> 
> It's not, but I thought setsockopt(2) w/ TCP_NOPUSH enabled the use of
> T/TCP in that there was no 3WHS performed on a TCP_NOPUSH'ed
> connection.

No, it does not.  T/TCP will only be used if you use sendto(), have
T/TCP globally enabled on the machine, and the server supports it too.

TCP_NOPUSH was introduced together with or some time after T/TCP to
change how tcp_output() pushes non-full packets onto the wire.  It
serves pretty much the same purpose as TCP_CORK.
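
To make the relationship concrete, here is a minimal sketch of setting the
option portably.  It assumes TCP_NOPUSH is what the BSDs provide and TCP_CORK
is what Linux provides; the names HOLD_OPT, tcp_hold() and tcp_hold_is_set()
are made up for this illustration and are not part of any system API.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Pick whichever "hold back partial segments" option this platform has. */
#if defined(TCP_NOPUSH)
#define HOLD_OPT TCP_NOPUSH     /* FreeBSD and other BSDs */
#elif defined(TCP_CORK)
#define HOLD_OPT TCP_CORK       /* Linux */
#else
#error "neither TCP_NOPUSH nor TCP_CORK is available"
#endif

/* Hypothetical helper: turn the option on (1) or off (0).
 * Returns 0 on success, -1 on error with errno set. */
static int tcp_hold(int fd, int on)
{
    assert(fd >= 0);    /* caller must pass an open socket */
    return setsockopt(fd, IPPROTO_TCP, HOLD_OPT, &on, sizeof(on));
}

/* Hypothetical helper: returns 1 if the option is set, 0 if it is
 * clear, -1 on error. */
static int tcp_hold_is_set(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, HOLD_OPT, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

Note that nothing here touches connection setup.  The option only changes
how queued data is pushed out, which is why it is unrelated to whether
T/TCP is in use.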

> >> and an internal reverse proxy server/modified apache that I've hacked
> >> together (reduces latency in a tiered request hierarchy a great deal,
> >> on order of the benchmarks from above).
> >
> > What syscall do you use to get to the other side in your reverse proxy?
> 
> On the client, sendto()/read().  On the server, setsockopt() + write().

Ok, then you are indeed using T/TCP (provided you have enabled it on
both machines).  The setsockopt() optimizes packet sending on the server
but otherwise doesn't have anything to do with T/TCP.

> > I'm not sure if I can follow you here.  TCP_CORK deals with the
> > different behaviour of connections with Nagle vs. TCP_NODELAY.
> > TCP_CORK lets you avoid the delays of Nagle by corking (sort of
> > blocking) the sending of packets until you are done with write()ing
> > to the socket.  Then the connection is uncorked and all data will be
> > sent in one go even if it doesn't fill an entire packet.  Sort of an
> > fsync() for sockets.  There are no security implications with
> > TCP_CORK as far as I am aware.
> 
> Isn't that what NOPUSH does?  Or is it that CORK uses a fully
> established TCP connection, but blocks sending data until the
> connection has been uncorked/flushed?  I thought that TCP_CORK had the
> same security implications that NOPUSH does (i.e., the lack of a
> handshake).

Neither one.  Neither NOPUSH nor CORK has any security implications.
Those come only from the T/TCP specification.  Holding back the data
is independent of the 3WHS.  Normally you have Nagle enabled (default)
and when you don't fill an entire packet worth of data it will wait
up to 200ms to send the packet in anticipation of more data from the
socket.  This screws up the responsiveness of your connection.  The
first solution is to turn off Nagle (with TCP_NODELAY) but now you get
a packet for every single write() you do.  Fine for telnet and ssh but
not the right thing for a database server.  There you don't want the
delay but at the same time you want several successive write()s to
go out in one packet on the wire.  This is where NOPUSH and CORK come
into play.
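
The two approaches can be sketched roughly as follows, on an ordinary
connection over loopback with a normal handshake.  This is only an
illustration: send_parts(), loopback_pair(), recv_exact() and HOLD_OPT are
names invented for the example, and it assumes that clearing the option
flushes whatever is queued, which to my knowledge is what FreeBSD and Linux
do.  The code cannot show how many packets hit the wire; you would need
tcpdump for that.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#if defined(TCP_NOPUSH)
#define HOLD_OPT TCP_NOPUSH     /* BSD */
#elif defined(TCP_CORK)
#define HOLD_OPT TCP_CORK       /* Linux */
#else
#error "neither TCP_NOPUSH nor TCP_CORK is available"
#endif

enum send_mode { SEND_NODELAY, SEND_HELD };

static int set_tcp_opt(int fd, int opt, int on)
{
    return setsockopt(fd, IPPROTO_TCP, opt, &on, sizeof(on));
}

/* Connected TCP pair over 127.0.0.1: sv[0] is the sender, sv[1] the peer. */
static int loopback_pair(int sv[2])
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd, cfd = -1, afd = -1;

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* port 0: kernel picks */

    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        listen(lfd, 1) == 0 &&
        getsockname(lfd, (struct sockaddr *)&addr, &len) == 0 &&
        (cfd = socket(AF_INET, SOCK_STREAM, 0)) >= 0 &&
        connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
        afd = accept(lfd, NULL, NULL);

    close(lfd);
    if (afd < 0) {
        if (cfd >= 0)
            close(cfd);
        return -1;
    }
    sv[0] = cfd;
    sv[1] = afd;
    return 0;
}

/* Write n strings with separate write() calls.
 * SEND_NODELAY: Nagle off, each write() may go out as its own packet.
 * SEND_HELD:    hold partial segments until all writes are queued,
 *               then clear the option to release them together.
 * Returns the number of bytes written, or -1 on error. */
static ssize_t send_parts(int fd, const char *const parts[], int n,
                          enum send_mode mode)
{
    ssize_t total = 0;
    int i;

    assert(n >= 0);
    if (set_tcp_opt(fd, mode == SEND_HELD ? HOLD_OPT : TCP_NODELAY, 1) < 0)
        return -1;
    for (i = 0; i < n; i++) {
        ssize_t w = write(fd, parts[i], strlen(parts[i]));
        if (w < 0)
            return -1;
        total += w;
    }
    if (mode == SEND_HELD && set_tcp_opt(fd, HOLD_OPT, 0) < 0)
        return -1;
    return total;
}

/* Read until `want` bytes have arrived or the peer closes. */
static ssize_t recv_exact(int fd, char *buf, size_t want)
{
    size_t got = 0;

    while (got < want) {
        ssize_t n = read(fd, buf + got, want - got);
        if (n < 0)
            return -1;
        if (n == 0)
            break;
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

A database-style server would call send_parts() with SEND_HELD for a reply
built from several small write()s, and an interactive program would use
SEND_NODELAY.  Either way the receiver sees the same byte stream; only the
packetization differs.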

> I was under the impression that by default NOPUSH would prevent a
> connect() until there was a full packet to be sent or the socket had
> been closed/flushed.  The first/only packet from the client to the
> server would contain a SYN+PUSH+FIN + the data for the request, then
> the server would come back with a SYN+PUSH+FIN+ACK.  Essentially UDP,
> but with checksums and packet retransmission built in.

More or less correct.  However, the SYN+FIN+Data is caused by T/TCP
and not NOPUSH.  NOPUSH is used as an optimization, as I have described
above, usually on the server side.

-- 
Andre


More information about the freebsd-arch mailing list