A small TCP bug: excessive duplicate ACKs

John Baldwin jhb at freebsd.org
Tue Feb 8 17:02:03 UTC 2011


One thing I've noticed at work is that if a receiver's socket buffer fills and 
the receiver then drains the buffer all at once, we send a lot of duplicate 
ACKs.  I narrowed this down to the abnormally high window scaling factor we
use.  We set kern.ipc.maxsockbuf to 314572800, which results in a window
scaling factor of 8k (a shift of 13).  This interacts poorly with the logic
that decides whether or not to force a window update in tcp_output():

        /*
         * Compare available window to amount of window
         * known to peer (as advertised window less
         * next expected input).  If the difference is at least two
         * max size segments, or at least 50% of the maximum possible
         * window, then want to send a window update to peer.
         * Skip this if the connection is in T/TCP half-open state.
         * Don't send pure window updates when the peer has closed
         * the connection and won't ever send more data.
         */
        if (recwin > 0 && !(tp->t_flags & TF_NEEDSYN) &&
            !TCPS_HAVERCVDFIN(tp->t_state)) {
                /*
                 * "adv" is the amount we can increase the window,
                 * taking into account that we are limited by
                 * TCP_MAXWIN << tp->rcv_scale.
                 */
                long adv = min(recwin, (long)TCP_MAXWIN << tp->rcv_scale) -
                        (tp->rcv_adv - tp->rcv_nxt);

                if (adv >= (long) (2 * tp->t_maxseg))
                        goto send;
                if (2 * adv >= (long) so->so_rcv.sb_hiwat)
                        goto send;
        }
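
For reference, here is a minimal sketch of where an 8k factor can come from.
It assumes the receive scale is chosen as the smallest shift for which
TCP_MAXWIN << shift covers the maximum socket buffer size, capped at 14.
The helper name rcv_scale_for() is hypothetical and is not taken from the
stack; it is only meant to show the arithmetic.

```c
#define TCP_MAXWIN        65535  /* largest window without scaling (16 bits) */
#define TCP_MAX_WINSHIFT  14     /* largest shift the window scale option allows */

/*
 * Hypothetical helper: smallest shift such that a scaled 16-bit window
 * can describe a buffer of sb_max bytes.
 */
static int
rcv_scale_for(unsigned long sb_max)
{
        int shift = 0;

        while (shift < TCP_MAX_WINSHIFT &&
            ((unsigned long)TCP_MAXWIN << shift) < sb_max)
                shift++;
        return (shift);
}
```

Under that assumption, rcv_scale_for(314572800UL) yields 13, so the window
on the wire can only change in steps of 1 << 13 = 8192 bytes.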

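Specifically, we can send a duplicate ACK when either (2 * tp->t_maxseg) or
(so->so_rcv.sb_hiwat / 2) is less than the window scaling factor, because
the window can then grow by enough to force an update without the scaled
value on the wire changing.  I have a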
test app that you can run against a TCP chargen service from inetd to 
reproduce it.  I also have two TCP dumps from before and after.  The patch I'm 
using to fix this is below (I could rework it to not use the extra goto 
perhaps, but went with a simple hack to minimize reindenting for now):

Index: tcp_output.c
===================================================================
--- tcp_output.c        (revision 217650)
+++ tcp_output.c        (working copy)
@@ -560,11 +560,19 @@
                long adv = min(recwin, (long)TCP_MAXWIN << tp->rcv_scale) -
                        (tp->rcv_adv - tp->rcv_nxt);
 
+               /* 
+                * If the new window size ends up being the same as the old
+                * size when it is scaled, then don't force a window update.
+                */
+               if ((tp->rcv_adv - tp->rcv_nxt) >> tp->rcv_scale ==
+                   (adv + tp->rcv_adv - tp->rcv_nxt) >> tp->rcv_scale)
+                       goto dontupdate;
                if (adv >= (long) (2 * tp->t_maxseg))
                        goto send;
                if (2 * adv >= (long) so->so_rcv.sb_hiwat)
                        goto send;
        }
+dontupdate:
 
        /*
         * Send if we owe the peer an ACK, RST, SYN, or urgent data.  ACKNOW

Note that if the ACK sequence number has moved then I think other checks in
tcp_output() will still force an ACK packet out, so I don't think this will
cause us to miss sending ACKs to the peer.
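
The difference between the original and patched checks can be sketched
outside the kernel as follows.  This is only an illustration: struct
rcv_state and the three function names are hypothetical stand-ins for the
tcpcb and socket fields, the TF_NEEDSYN and FIN tests are left out, and the
advertised window is assumed to be non-negative.

```c
#include <assert.h>
#include <stdbool.h>

#define TCP_MAXWIN        65535L
#define TCP_MAX_WINSHIFT  14

/* Hypothetical stand-in for the tcpcb and socket fields the check reads. */
struct rcv_state {
        long recwin;      /* free space in the receive buffer */
        long advertised;  /* tp->rcv_adv - tp->rcv_nxt, assumed >= 0 */
        long maxseg;      /* tp->t_maxseg */
        long hiwat;       /* so->so_rcv.sb_hiwat */
        int  rcv_scale;   /* tp->rcv_scale */
};

/* "adv" from tcp_output(): how far the window could be opened. */
static long
window_advance(const struct rcv_state *s)
{
        long limit, win;

        assert(s->rcv_scale >= 0 && s->rcv_scale <= TCP_MAX_WINSHIFT);
        limit = TCP_MAXWIN << s->rcv_scale;
        win = s->recwin < limit ? s->recwin : limit;
        return (win - s->advertised);
}

/* Original logic: update on two segments or half the buffer. */
static bool
old_wants_update(const struct rcv_state *s)
{
        long adv = window_advance(s);

        if (s->recwin <= 0)
                return (false);
        return (adv >= 2 * s->maxseg || 2 * adv >= s->hiwat);
}

/* Patched logic: skip the update if the scaled window is unchanged. */
static bool
patched_wants_update(const struct rcv_state *s)
{
        long adv = window_advance(s);

        if (s->recwin <= 0)
                return (false);
        if ((s->advertised >> s->rcv_scale) ==
            ((adv + s->advertised) >> s->rcv_scale))
                return (false);
        return (adv >= 2 * s->maxseg || 2 * adv >= s->hiwat);
}
```

For example, with rcv_scale = 13, an advertised window of 8192 and
recwin = 12000, adv is 3808.  That exceeds two 1448-byte segments (an
assumed MSS), so the original check forces an update even though both the
old and new windows scale to 1 on the wire; the patched check does not.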

You can find the test app source (tcpslow.c) and the dumps at 
http://people.freebsd.org/~jhb/tcpslow/

If you look at tcp_bad.out, the receiver stops reading data and its socket
buffer fills up around packet 72 or so.  The receiver wakes up at packet 88
and drains the buffer, causing a small storm of window updates.  However, due
to the scaling factor, it actually sends duplicate ACKs in batches of three
(3 ACKs for 8k window, 3 ACKs for 16k window, etc.).  This happens each time
the receiver wakes up and drains a full socket buffer.  The tcp_good.out dump
shows the stream with the patch applied.  A similar event of the receiver
draining a full buffer starts at packet 83, and it sends a single ACK for each
"real" window update.
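
The batching can be illustrated with a toy model of a drain.  This is a
sketch under stated assumptions, not a reproduction of the dumps: the 1448
byte MSS and read size are made up, only the two-segment test is modeled
(the 50% test and the TCP_MAXWIN limit are ignored), and simulate_drain()
and struct drain_result are hypothetical names.

```c
#include <assert.h>

struct drain_result {
        int updates;     /* window updates sent */
        int duplicates;  /* updates whose scaled window matched the previous one */
};

/*
 * Drain a full buffer of "total" bytes in reads of "chunk" bytes and count
 * the window updates that the two-segment test would force.
 */
static struct drain_result
simulate_drain(long total, long chunk, long maxseg, int scale, int patched)
{
        struct drain_result r = { 0, 0 };
        long free_space = 0;   /* buffer starts full */
        long advertised = 0;   /* window the peer currently knows about */
        long last_wire = 0;    /* scaled window in the last ACK sent */

        assert(total > 0 && chunk > 0 && scale >= 0);
        while (free_space < total) {
                free_space += chunk;
                if (free_space > total)
                        free_space = total;

                long adv = free_space - advertised;
                long wire = free_space >> scale;

                /* Patched check: same scaled window, so no update. */
                if (patched && (advertised >> scale) == wire)
                        continue;
                if (adv < 2 * maxseg)
                        continue;

                r.updates++;
                if (wire == last_wire)
                        r.duplicates++;
                last_wire = wire;
                advertised = free_space;
        }
        return (r);
}
```

With these assumed inputs, simulate_drain(65536, 1448, 1448, 13, 0) sends
22 updates, 15 of which repeat the previous scaled window, mostly in runs of
three per 8k step.  With the patched check enabled (last argument 1) the same
drain sends 8 updates and no duplicates.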

-- 
John Baldwin


More information about the freebsd-net mailing list