Some questions about the new TCP congestion control code
John Baldwin
jhb at freebsd.org
Mon Jan 14 21:04:35 UTC 2013
I was looking at TCP congestion control at work recently and noticed a few
"odd" things in the current code. First, there is this chunk of code in
cc_ack_received() in tcp_input.c:
static void inline
cc_ack_received(struct tcpcb *tp, struct tcphdr *th, uint16_t type)
{
INP_WLOCK_ASSERT(tp->t_inpcb);
tp->ccv->bytes_this_ack = BYTES_THIS_ACK(tp, th);
if (tp->snd_cwnd == min(tp->snd_cwnd, tp->snd_wnd))
tp->ccv->flags |= CCF_CWND_LIMITED;
else
tp->ccv->flags &= ~CCF_CWND_LIMITED;
Due to hysterical raisins, snd_cwnd and snd_wnd are u_long values, not
integers, so the call to min() results in truncation on 64-bit hosts. It
should probably be ulmin() instead. However, this line seems to be a really
obfuscated way to just write:
if (tp->snd_cwnd <= tp->snd_wnd)
If that is correct, I would vote for changing this to use the much simpler
logic.
Secondly, in the particular case I was investigating at work (restart of an
idle connnection), the newreno congestion control code in 8.x and later uses a
different algorithm than in 7. Specifically, in 7 TCP would reuse the same
logic used for an initial cwnd (honoring ss_fltsz). In 8 this no longer
happens (instead, 2 is hardcoded). A guess at a possible fix might look
something like this:
Index: cc_newreno.c
===================================================================
--- cc_newreno.c (revision 243660)
+++ cc_newreno.c (working copy)
@@ -169,8 +169,21 @@ newreno_after_idle(struct cc_var *ccv)
if (V_tcp_do_rfc3390)
rw = min(4 * CCV(ccv, t_maxseg),
max(2 * CCV(ccv, t_maxseg), 4380));
+#if 1
else
rw = CCV(ccv, t_maxseg) * 2;
+#else
+ /* XXX: This is missing a lot of stuff that used to be in 7. */
+#ifdef INET6
+ else if ((isipv6 ? in6_localaddr(&CCV(ccv, t_inpcb->in6p_faddr)) :
+ in_localaddr(CCV(ccv, t_inpcb->inp_faddr))))
+#else
+ else if (in_localaddr(CCV(ccv, t_inpcb->inp_faddr)))
+#endif
+ rw = V_ss_fltsz_local * CCV(ccv, t_maxseg);
+ else
+ rw = V_ss_fltsz * CCV(ccv, t_maxseg);
+#endif
CCV(ccv, snd_cwnd) = min(rw, CCV(ccv, snd_cwnd));
}
(But using the #else clause instead of the current #if 1 code). Was this
change in 8 intentional? If not, I would suggest that a real fix would be to
export a function that includes the logic to compute the initial cwnd and
share that between cc_conn_init() in tcp_input.c and cc_newreno.c. (Yes, I
realize that ss_fltsz and friends are gone in 10, but if this was a bug then
the fix I suggested above of using a common function could be applied to 8 to
restore that functionality if desired. The bigger point is to make sure what
we are doing is correct as I'm not sure.)
--
John Baldwin
More information about the freebsd-net
mailing list