The tale of a TCP bug

John Baldwin jhb at freebsd.org
Fri Mar 25 20:40:17 UTC 2011


On Friday, March 25, 2011 3:41:09 pm Stefan `Sec` Zehl wrote:
> Hi,
> 
> On Fri, Mar 25, 2011 at 08:25 -0400, John Baldwin wrote:
> > Ah, ok.  Can you try this patch first (without the other)?  If it doesn't
> > work then we can refine the patch above further.
> 
> I tried it completely unpatched and with your new patch. In both cases
> that if() statement is not taken. 
> 
> Instrumenting this part of the code with printf()s shows that recwin is
> 65536 right after your patched if, but reduced to 65535 by the next
> statement.
> 
> |  	if (recwin > (long)TCP_MAXWIN << tp->rcv_scale)
> |  		recwin = (long)TCP_MAXWIN << tp->rcv_scale;
> 
> That's the same effect as in the affected adv calculation:
> 
> % long adv = min(recwin, (long)TCP_MAXWIN << tp->rcv_scale) -
> %       (tp->rcv_adv - tp->rcv_nxt);
> 
> recwin is 65536, but the min limits it to 65535.
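
For reference, the clamp quoted above can be reproduced in isolation.  This is
only a minimal sketch, not the kernel code: clamp_recwin() is a hypothetical
helper, TCP_MAXWIN is redefined locally as 65535, and rcv_scale is assumed to
be 0 for this connection, which is what the reported numbers suggest.

```c
/* Local stand-in for the kernel constant: largest unscaled TCP window. */
#define TCP_MAXWIN 65535L

/*
 * Hypothetical helper mirroring the quoted if() statement: limit the
 * receive window to what the negotiated window scale can express.
 */
static long
clamp_recwin(long recwin, int rcv_scale)
{
	long limit = TCP_MAXWIN << rcv_scale;

	if (recwin > limit)
		recwin = limit;
	return (recwin);
}
```

With rcv_scale == 0, clamp_recwin(65536, 0) yields 65535, matching the value
the printf()s reported after that statement.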

Reading some more.  I'm trying to understand the breakage in your case.

You are saying that FreeBSD is the sender, who has data to send, yet is not 
sending any window probes because it never starts the persist timer when the 
initial window is zero?  Is that correct?

And the problem is that the code that uses 'adv' to determine if it should send 
a window update to the remote end is falsely succeeding due to the overflow, 
causing tcp_output() to 'goto send', but that it then fails to send any data 
because it thinks the remote window is full?

So one thing I don't quite follow is how you end up with rcv_nxt > rcv_adv.  I 
saw this when the other side would send a window probe, and then the receiving 
side would take the -1 remaining window and explode it into the maximum window 
size when it ACKed.
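
To illustrate the "-1 remaining window" case: a minimal sketch, assuming
sequence numbers are 32-bit unsigned values.  The seq_t type and both helpers
are hypothetical names used only for this example, not the kernel's code.

```c
#include <stdint.h>

/* Assumption: TCP sequence numbers are 32-bit unsigned values. */
typedef uint32_t seq_t;

/*
 * Remaining advertised window computed in unsigned arithmetic.  Once
 * rcv_nxt has moved past rcv_adv, the subtraction wraps around to a
 * huge positive value instead of going negative.
 */
static uint32_t
window_left_unsigned(seq_t rcv_adv, seq_t rcv_nxt)
{
	return (rcv_adv - rcv_nxt);
}

/*
 * The same difference viewed as a signed 32-bit value, which is how
 * sequence-space comparisons are commonly done.  The cast assumes the
 * usual two's complement conversion.
 */
static int32_t
window_left_signed(seq_t rcv_adv, seq_t rcv_nxt)
{
	return ((int32_t)(rcv_adv - rcv_nxt));
}
```

If rcv_nxt is one byte past rcv_adv, the unsigned result is 4294967295 while
the signed view is -1, so any code consuming the unsigned value sees an
enormous window.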

Are you seeing the other end of the connection send a window probe, but 
FreeBSD is not setting the persist timer so that it will send its own window 
probes?

-- 
John Baldwin


More information about the freebsd-net mailing list