The tale of a TCP bug

J. Hellenthal jhell at DataIX.net
Mon Mar 28 01:39:28 UTC 2011


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



On Sat, 26 Mar 2011 18:43, sec@ wrote:
> Hi,
>
>> On Fri, Mar 25, 2011 at 16:40 -0400, John Baldwin wrote:
>>> And the problem is that the code that uses 'adv' to determine if it
>>> should send a window update to the remote end is falsely succeeding due
>>> to the overflow causing tcp_output() to 'goto send' but that it then
>>> fails to send any data because it thinks the remote window is full?
>
> On a whim I wanted to find out how often that overflow is triggered in
> normal operation, so I whipped up a quick counter sysctl.
>
> --- sys/netinet/tcp_output.c.org	2011-01-04 19:27:00.000000000 +0100
> +++ sys/netinet/tcp_output.c	2011-03-26 18:49:30.000000000 +0100
> @@ -87,6 +87,11 @@
> extern struct mbuf *m_copypack();
> #endif
>
> +VNET_DEFINE(int, adv_neg) = 0;
> +SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, adv_neg, CTLFLAG_RD,
> +   &VNET_NAME(adv_neg), 1,
> +   "How many times adv got negative");
> +
> VNET_DEFINE(int, path_mtu_discovery) = 1;
> SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, path_mtu_discovery, CTLFLAG_RW,
> 	&VNET_NAME(path_mtu_discovery), 1,
> @@ -573,6 +578,10 @@
> 		long adv = min(recwin, (long)TCP_MAXWIN << tp->rcv_scale) -
> 			(tp->rcv_adv - tp->rcv_nxt);
>
> +		if(min(recwin, (long)TCP_MAXWIN << tp->rcv_scale) <
> +				(tp->rcv_adv - tp->rcv_nxt))
> +			adv_neg++;
> +
> 		if (adv >= (long) (2 * tp->t_maxseg))
> 			goto send;
> 		if (2 * adv >= (long) so->so_rcv.sb_hiwat)
>
> I booted my main (web/shell) box with (only) this patch:
>
> 11:36PM  up  3:50, 1 user, load averages: 2.29, 1.51, 0.73
> net.inet.tcp.adv_neg: 2466
>
> That's approximately once every 5 seconds. That's way more often than I
> suspected.
>
> CU,
>    Sec
>
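
A minimal sketch of the wraparound being counted, written in plain sh
rather than kernel C. It assumes the subtraction in tcp_output() is
carried out in unsigned 32-bit arithmetic before the result is stored in
'adv' (the kernel's min() works on u_int, and rcv_adv/rcv_nxt are 32-bit
sequence numbers), and that the shell's own arithmetic is 64 bits wide.
The function name adv_as_long and the sample numbers are made up for
illustration.

```sh
# adv_as_long WINDOW ADVERTISED LONGBITS
#   WINDOW     stands in for min(recwin, TCP_MAXWIN << rcv_scale)
#   ADVERTISED stands in for tp->rcv_adv - tp->rcv_nxt
#   LONGBITS   is the width of 'long' on the target (32 or 64)
# Prints the value 'adv' would hold under the assumptions stated above.
adv_as_long() {
	# Unsigned 32-bit subtraction: wraps modulo 2^32 when ADVERTISED > WINDOW.
	diff=$(( ($1 - $2) & 0xFFFFFFFF ))
	# A 32-bit long reinterprets the top bit as the sign.
	if [ "$3" -eq 32 ] && [ "$diff" -ge 2147483648 ]; then
		diff=$(( diff - 4294967296 ))
	fi
	echo "$diff"
}

# Hypothetical sample: 3000 bytes already advertised, only 1000 available now.
adv_as_long 1000 3000 32
adv_as_long 1000 3000 64
```

With these sample values the 32-bit case comes out negative, so a test
like 'adv >= 2 * t_maxseg' fails, while the 64-bit case comes out as a
very large positive number and the same test would succeed. The condition
the counter checks (WINDOW < ADVERTISED) is true in both cases.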

With this patch applied together with John's on a 32-bit box, I can
repeatedly bump this sysctl by opening an SSL connection to another host.
It doesn't seem to matter what the destination is.

curl -q https://www.changeip.com/ip.asp

It also bumps on SSL connections carrying protocols other than HTTPS.

This behavior does not seem to happen with non-SSL connections.
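
A rough way to check a single command against the counter is to read the
sysctl before and after running it. This is only a sketch: bump_check and
read_counter are made-up names, and any other traffic on the box moves the
counter too, so the delta is indicative at best.

```sh
# Hook that returns the current counter value. The sysctl name is the one
# added by sec's patch; it does not exist on an unpatched kernel.
read_counter() {
	sysctl -n net.inet.tcp.adv_neg
}

# bump_check COMMAND [ARGS...]
# Runs COMMAND and reports how far the counter moved while it ran.
bump_check() {
	before=$(read_counter) || return 1
	"$@" >/dev/null 2>&1
	after=$(read_counter) || return 1
	echo "before:$before after:$after delta:$((after - before))"
}

# Example (needs the patched kernel and network access):
#   bump_check curl -q https://www.changeip.com/ip.asp
```

Running the same check against a plain HTTP URL gives a side-by-side
comparison of the SSL and non-SSL cases.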

Attached for reference is the script I am using to monitor the sysctl here.
Its output fields are:

L = Last value
C = Current value
D = Difference
I = Log interval
S = Seconds since last change
* = marks a line where the value changed

/bin/sh ./adv_neg_mon.sh 7 |tee -a adv_neg.log
[...]
L:41 C:41 D:0 I:7 S:7.000000e+01
L:41 C:41 D:0 I:7 S:7.700000e+01
L:41 C:43 D:2 I:7 S:8.400000e+01 *
L:43 C:88 D:45 I:7 S:7.000000e+00 *
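
The marked lines in a log like the one above can be tallied with a small
awk filter. A sketch, assuming the field order given in the legend;
summarize_log is a made-up name.

```sh
# summarize_log: read monitor output on stdin, then print the number of
# marked lines and the sum of their D: fields.
summarize_log() {
	awk '
		/ \*$/ {
			sub(/^D:/, "", $3)	# third field is "D:<difference>"
			changes++
			total += $3
		}
		END { printf "changes:%d total:%d\n", changes, total }
	'
}

# Example:
#   summarize_log < adv_neg.log
```

For the four log lines shown above this prints "changes:2 total:47".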


- -- 

  Regards,

  J. Hellenthal
  (0x89D8547E)
  JJH48-ARIN

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (FreeBSD)
Comment: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x89D8547E

iQEcBAEBAgAGBQJNj+a7AAoJEJBXh4mJ2FR+VssIAI7QSUUb6jvZdMWxxVGPpr6o
vPGDqPfWxNcih4D5SZxJJtsslnunpAcOjSWK8YGvOCINt8XhexVOSklyHuyvjIWd
4ijywngx5H2RT22c6wTdNPOfsZzoBkvLZZ2mj2cUF1ISxrvgy5syMp/TnANE3kul
Mqf29HA8t3qYQCfb6zuFoWGdYI5Ahfsks4rljZJy/5bRQfNceJwBjUGnSlL0651m
Bl4GpcNWA0fbuJeUgEzIK6mOpNdoI+PrZv6GEG7LErLaVtr+43gET/YITuGv1jY3
dlQ1WkHZSnaG/S7vpWbb2W/cuJ8ak6esbM74x8KakiOnLeJgy0MYK8oqYJyN3aI=
=l+iW
-----END PGP SIGNATURE-----
-------------- next part --------------
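#!/bin/sh
# Poll net.inet.tcp.adv_neg and log one line per interval.
# Usage: adv_neg_mon.sh [interval-seconds]   (default: 5)

trap 'exit 1' 2		# exit on SIGINT (Ctrl-C)

UPDATE=${1:-5}		# polling interval in seconds
USECS=0			# seconds since the counter last changed
LVAL=			# previous counter value (empty until the first sample)

while true; do
	NVAL=$(sysctl -n net.inet.tcp.adv_neg)
	# First pass: no previous sample yet, so report a difference of 0.
	if [ -z "$LVAL" ]; then
		LVAL=${NVAL}
	fi
	if [ "$NVAL" -gt "$LVAL" ]; then
		# Counter moved: mark the line with '*' and restart the timer.
		echo "L:$LVAL C:$NVAL D:$((NVAL - LVAL)) I:${UPDATE} S:$(printf %e "${USECS}") *"
		USECS=${UPDATE}
	else
		echo "L:$LVAL C:$NVAL D:$((NVAL - LVAL)) I:${UPDATE} S:$(printf %e "${USECS}")"
		USECS=$((USECS + UPDATE))
	fi
	LVAL=${NVAL}
	sleep "$UPDATE"
done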

