The tale of a TCP bug
    John Baldwin 
    jhb at freebsd.org
       
    Wed Mar 30 12:38:25 UTC 2011
    
    
  
On Monday, March 28, 2011 2:38:10 pm Stefan `Sec` Zehl wrote:
> Hi,
> 
> On Mon, Mar 28, 2011 at 14:23 -0400, John Baldwin wrote:
> > 
> > No, this is not really right.  Your patch from your blog is the best
> > fix actually.  The reason we want to let 'win' be larger than
> > TCP_MAXWIN is that if the remote end sends more data than we've
> > advertised but we have room in the socket buffer, we want to go ahead
> > and accept the data as valid and ACK it rather than dropping the data
> > that is beyond rcv_adv.  My change above to rcv_wnd would break this.
> > Also, for the TCPS_SYN_SENT case we don't know what 'rcv_scale' is
> > until just before we update 'rcv_adv'.  This should be the same 
> > as your patch:
> > 
> > Index: tcp_input.c
> > ===================================================================
> > --- tcp_input.c	(revision 220098)
> > +++ tcp_input.c	(working copy)
> > @@ -1756,7 +1756,8 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th,
> >  				(TF_RCVD_SCALE|TF_REQ_SCALE)) {
> >  				tp->rcv_scale = tp->request_r_scale;
> >  			}
> > -			tp->rcv_adv += tp->rcv_wnd;
> > +			tp->rcv_adv += imin(tp->rcv_wnd,
> > +			    TCP_MAXWIN << tp->rcv_scale);
> >  			tp->snd_una++;		/* SYN is acked */
> >  			/*
> >  			 * If there's data, delay ACK; if there's also a FIN
> > 
> 
> I've applied this to my test-VM, and as expected it now passes my two
> testcases. As far as I'm concerned this fixes it for me.
> 
> I'm interested to see if my adv_neg counting hack together with this
> patch still registers any hits. -- If nobody beats me to it, I'll try it
> out on my webserver tomorrow.
There is at least one case I know of related to a bug I reported earlier
where a window probe from a remote connection can cause rcv_nxt to advance
past rcv_adv by one.  However, I think we want to know about those cases,
and we should probably be treating rcv_adv - rcv_nxt as if it is zero in 
that case, not -1 (my patch in my original e-mail does just that in a
different place in tcp_output() when we calculate the window "for real").
-- 
John Baldwin
    
    
More information about the freebsd-net
mailing list