tcp.recvspace=56K causes dropped packets with V.90 PPP link

Tue Oct 14 13:54:22 PDT 2003

FreeBSD 4.8-R kernel.GENERIC
user ppp
P II - 400
128 MB RAM
56K ext. modem, 45.3K connection

The following is something I posted to freebsd-net about a week
ago.  Since then I have pinpointed the cause of the problem.
It's not a bug, it's a configuration problem.

From running tcpdump on an ftp transfer I discovered that the
ftp client stalls on receive due to a dropped packet.  The
dropped packet is not retransmitted until the socket receive
buffer has filled up with packets following the dropped packet.
Until the dropped packet is retransmitted the next chunk of data
is not available to the ftp client.

The cause of the dropped packet is a combination of a slow link
(5 KB/s V.90 PPP link) and a large TCP receive window.  At 56 KB
the window is sufficiently large to allow the sender to more
than fill the queue of some router in the path (probably the
router at my ISP's POP).
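
As a rough back-of-the-envelope check (a sketch only, assuming
the ~5 KB/s throughput quoted above and ignoring round-trip
latency and header overhead; the function name is purely
illustrative), this is how long one full window of data takes
to cross the link:

```python
# Rough estimate: time for one full receive window to drain through
# the slow link. Assumes ~5 KB/s (the V.90 figure quoted above) and
# ignores latency and protocol overhead.

def drain_seconds(window_kb, link_kb_per_s=5.0):
    """Seconds needed for window_kb of data to cross the link."""
    return window_kb / link_kb_per_s

for window_kb in (16, 32, 48, 56):
    secs = drain_seconds(window_kb)
    print(f"{window_kb:2d} KB window -> {secs:4.1f} s of data in flight")
```

At 56 KB that is about 11 seconds of data the sender may push
toward the link before it needs any acknowledgment, all of which
has to sit in some queue upstream of the modem.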

Keep in mind that this dropped-packet situation is 100%
repeatable, with the same packet getting dropped.  It's not just
Internet congestion somewhere.

Upon closer inspection I've also seen dropped packets with a 48
KB receive window.  32 KB doesn't give me dropped packets,
though.

One moral of the story is that there is such a thing as a
too-large TCP receive window, and for me >= 48 KB is too large.
Another moral of the story is that on a machine with both very
fast and very slow network interfaces there may be no single
receive window size that is optimal for all interfaces.
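
One possible per-connection workaround, which I have NOT tested
as part of this report, is for the application itself to request
a smaller receive buffer with SO_RCVBUF.  A minimal sketch in
Python; the 32 KB figure comes from the observation above, and
the helper name is made up for illustration:

```python
import socket

RCVBUF_BYTES = 32 * 1024  # 32 KB: largest size that gave no drops here

def make_small_window_socket(rcvbuf=RCVBUF_BYTES):
    """Create a TCP socket and request a smaller receive buffer."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set before connect() so the smaller buffer applies from the
    # start of the connection.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    return s

sock = make_small_window_socket()
# The kernel may round or adjust the request, so read back what it
# actually granted rather than assuming the requested value.
effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("requested", RCVBUF_BYTES, "effective", effective)
sock.close()
```

This only helps for programs that can be changed or configured
to do so; it does nothing for the system-wide default.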

P.S. I struck out trying to set a smaller-than-default size for
the receive window of the default route (my PPP link) due to a
bug in tcp_input.c that was introduced when PR 11966 was fixed.
The new bug does not allow for shrinking the send/receive window
sizes beneath their current value, even if that current value
was NOT set by an application (i.e., it's the system-wide
default value).  But that's another problem.

http://www.freebsd.org/cgi/query-pr.cgi?pr=11966

------------------------------------------------------------------

Something in the socket/proto/network interface area doesn't
work correctly when tcp.recvspace=56K, the default value in 4.8-R.
It DOES work correctly when tcp.recvspace=16K, 32K, or 48K.

I see the following repeatable problem.  At the same point in
the ftp reception of a 218K .gz file, received data stops getting
delivered to the ftp client and starts stacking up in mbufs.
The ftp client reports "stalled".  When the recvspace limit
is reached, the entire socket receive buffer is delivered to
the ftp client as fast as the client can take it.  For the
remainder of the file transfer there are no further ftp
client stalls.  Note that the stall occurs at approx.
2 * tcp.recvspace.
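
The "2 * tcp.recvspace" observation can be checked against the
figures in the 10/05/03 log further down (stall at 113K, jump to
170K).  A small sketch of that arithmetic, with variable names
of my own choosing:

```python
RECVSPACE_KB = 56     # tcp.recvspace in effect during the stall
stall_at_kb = 113     # transfer position where the client stalled
resume_at_kb = 170    # position the counter jumped to afterwards

predicted_stall_kb = 2 * RECVSPACE_KB
jump_kb = resume_at_kb - stall_at_kb

print("2 * recvspace =", predicted_stall_kb, "KB, stall seen at",
      stall_at_kb, "KB")
print("jump after stall =", jump_kb, "KB, recvspace =",
      RECVSPACE_KB, "KB")
```

Both numbers land within about 1 KB of the receive buffer size,
which fits the picture of one full socket buffer being held back
and then released at once.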

Please note that this stall would not be perceptible on a
LAN: the 56K socket receive buffer would fill up too quickly.

This would be mostly harmless were it not for the extra mbufs
being consumed.  On a system with many TCP connections the
supply of mbufs might be exhausted.

Following is a log of my activities in trying to track
down this problem.  I haven't been able to pinpoint it.
Does anyone have any idea what it might be or how to
further track it down?

10/05/03

Start: tcp.recvspace=48K
ftp receive 218K .gz file
no ftp client stalls
netstat -m
191/224/6016 mbufs in use (current/peak/max):
        189 mbufs allocated to data
        2 mbufs allocated to packet headers
130/146/1504 mbuf clusters in use (current/peak/max)
348 Kbytes allocated to network (7% of mb_map in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

Change tcp.recvspace=56K (default)
ftp receive 218K .gz file
ftp client stalls @ 113K for ~10 seconds, then jumps to 170K
ftp client reports "stalled"
no modem RxD stall
netstat -w 2 -I tun0 shows no stall in tun0 input.
netstat -m
192/464/6016 mbufs in use (current/peak/max):
        190 mbufs allocated to data
        2 mbufs allocated to packet headers
130/146/1504 mbuf clusters in use (current/peak/max)
408 Kbytes allocated to network (9% of mb_map in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

Consumed 464-224=240 extra mbufs, no extra mbuf clusters
408-348=60 KB additional RAM allocated to network
Note: MSIZE=256 (machine/param.h)
Note: No stall w/ tcp.recvspace = 16K, 32K, 48K.
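
The two figures above are consistent with each other, as this
small check shows (a sketch using only the numbers from the two
netstat -m runs and MSIZE; variable names are my own):

```python
MSIZE = 256        # bytes per mbuf, from machine/param.h

peak_48k = 224     # peak mbufs with tcp.recvspace=48K
peak_56k = 464     # peak mbufs with tcp.recvspace=56K
net_kb_48k = 348   # Kbytes allocated to network, 48K run
net_kb_56k = 408   # Kbytes allocated to network, 56K run

extra_mbufs = peak_56k - peak_48k
extra_kb_from_mbufs = extra_mbufs * MSIZE // 1024
extra_kb_reported = net_kb_56k - net_kb_48k

print(extra_mbufs, "extra mbufs *", MSIZE, "bytes =",
      extra_kb_from_mbufs, "KB")
print("netstat reports", extra_kb_reported, "KB extra")
```

So the extra 60 KB is accounted for by plain mbufs alone, and it
is close to the 56K receive buffer size.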

Tried ftp -d: echoed all commands sent to host, but
didn't appear to produce any socket/proto debug output.