TCP Loopback Connections with the Same Src/Dest Port

Wed Jul 17 11:09:08 UTC 2013

Our system is based on FreeBSD 8.1.  In some tests, we were having
issues caused by connections of this form (more details below):

 TCP4      0      0      0/   0/   0    127.0.0.1.665   127.0.0.1.665   FIN_WAIT_1
 TCP4      0      0      0/   0/   0    127.0.0.1.637   127.0.0.1.637   FIN_WAIT_1
 TCP4      0      0      0/   0/   0    127.0.0.1.648   127.0.0.1.648   FIN_WAIT_1

Some questions we had:

- Has anyone else seen TCP connections created with identical source
and destination address/port pairs?  Does anyone know of a legitimate
reason why they should be allowed?

- If there are no known use cases for this type of connection, does
anyone have more context/insight on the design here: should this type
of inpcb creation be prevented in the kernel or is it the
application's responsibility to ensure it never creates this type of
socket?
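
On the second question, one possible application-side guard is to
compare the local and peer endpoints right after connect() and to close
and retry when they match.  A minimal sketch, assuming IPv4 sockets;
endpoints_equal() and is_self_connected() are hypothetical helper names
used for illustration, not existing APIs:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Return 1 if both IPv4 endpoints have the same address and port. */
int
endpoints_equal(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    return (a->sin_addr.s_addr == b->sin_addr.s_addr &&
        a->sin_port == b->sin_port);
}

/*
 * Return 1 if the connected IPv4 socket fd is connected to itself,
 * 0 if it is not, and -1 on error (bad fd, not connected, not IPv4).
 */
int
is_self_connected(int fd)
{
    struct sockaddr_in local, peer;
    socklen_t llen = sizeof(local);
    socklen_t plen = sizeof(peer);

    if (getsockname(fd, (struct sockaddr *)&local, &llen) == -1 ||
        getpeername(fd, (struct sockaddr *)&peer, &plen) == -1)
        return (-1);
    if (local.sin_family != AF_INET || peer.sin_family != AF_INET)
        return (-1);
    return (endpoints_equal(&local, &peer));
}
```

A caller would run is_self_connected(fd) once connect() has succeeded
and treat a return value of 1 as a failed attempt.  This only covers the
application side; it says nothing about whether the kernel should refuse
to create such an inpcb.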

For those interested, more details of the issue seen follow.  The
connection seems to get stuck in swi_net sending and receiving pure
FIN/ACKs to itself:

#12 0xffffffff804372ce in ip_output (m=0xffffff0003ccf300,
    opt=<optimized out>, ro=0xffffff8020c2b6a0, flags=0, imo=0x0,
    inp=0xffffff0019933968)
    at ../../../../sys/netinet/ip_output.c
#13 0xffffffff804423dc in tcp_output (tp=0xffffff0019de2370)
    at ../../../../sys/netinet/tcp_output.c
#14 0xffffffff8043ef5d in tcp_do_segment (m=0xffffff0019af1200,
    th=0x100200, so=0xffffff011ac59570, tp=0xffffff0019de2370,
    drop_hdrlen=52, tlen=0, iptos=0 '\000', ti_locked=3)
    at ../../../../sys/netinet/tcp_input.c
#15 0xffffffff80440311 in tcp_input (m=0xffffff0019af1200,
    off0=<optimized out>)
    at ../../../../sys/netinet/tcp_input.c
#16 0xffffffff8043530b in ip_input (m=0xffffff0019af1200)
    at ../../../../sys/netinet/ip_input.c
#17 0xffffffff8040889f in netisr_process_workstream_proto
    (proto=<optimized out>, nwsp=<optimized out>)
    at ../../../../sys/net/netisr.c
#18 swi_net (arg=0xffffffff80f59800) at ../../../../sys/net/netisr.c

swi_net() just continues in this loop, ad nauseam:

        while ((bits = nwsp->nws_pendingbits) != 0) {
                while ((prot = ffs(bits)) != 0) {
                        prot--;
                        bits &= ~(1 << prot);
                        (void)netisr_process_workstream_proto(nwsp, prot);
                }
        }
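
To make the "never drains" behaviour concrete, here is a small
user-space model of that loop shape.  It is only a sketch under stated
assumptions: struct workstream, process_proto(), drain() and the two
handlers are hypothetical stand-ins, not the kernel's netisr code, and
the iteration cap exists only so the model terminates.  The handler
returning nonzero models packet processing that queues another packet
for the same protocol.

```c
#include <strings.h>    /* ffs() */

/* Hypothetical stand-in for per-workstream state (not the kernel struct). */
struct workstream {
    unsigned int pendingbits;   /* one bit per protocol with queued work */
    unsigned long handled;      /* number of processing passes so far */
};

/* Hypothetical handler: returns nonzero if it queued more work. */
typedef int (*proto_handler)(int prot);

/* Model of one pass: clear the pending bit, run the handler, re-arm if needed. */
void
process_proto(struct workstream *ws, int prot, proto_handler h)
{
    ws->pendingbits &= ~(1u << prot);
    ws->handled++;
    if (h(prot))
        ws->pendingbits |= 1u << prot;
}

/* Same loop shape as above, plus a cap. Returns 1 if drained, 0 if capped. */
int
drain(struct workstream *ws, proto_handler h, unsigned long cap)
{
    unsigned int bits;
    int prot;

    while ((bits = ws->pendingbits) != 0) {
        while ((prot = ffs((int)bits)) != 0) {
            prot--;                 /* ffs() is 1-based */
            bits &= ~(1u << prot);  /* done with this bit in the snapshot */
            if (ws->handled >= cap)
                return (0);
            process_proto(ws, prot, h);
        }
    }
    return (1);
}

/* Handler that finishes its work without queueing anything new. */
int handler_done(int prot) { (void)prot; return (0); }

/* Handler that always queues another packet for the same protocol. */
int handler_requeue(int prot) { (void)prot; return (1); }
```

With handler_done the outer loop exits once every pending bit has been
serviced.  With handler_requeue the pending bit is set again on every
pass, so only the artificial cap stops the loop.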

The tcp_output() call being triggered in tcp_do_segment() in this case
is the one shown on line 2303 below:

2212         /*
2213          * In ESTABLISHED state: drop duplicate ACKs; ACK out of range
2214          * ACKs.  If the ack is in the range
2215          *      tp->snd_una < th->th_ack <= tp->snd_max
2216          * then advance tp->snd_una to th->th_ack and drop
2217          * data from the retransmission queue.  If this ACK reflects
2218          * more up to date window information we update our window information.
2219          */
2220         case TCPS_ESTABLISHED:
2221         case TCPS_FIN_WAIT_1:
2222         case TCPS_FIN_WAIT_2:
2223         case TCPS_CLOSE_WAIT:
2224         case TCPS_CLOSING:
2225         case TCPS_LAST_ACK:
2226                 if (SEQ_GT(th->th_ack, tp->snd_max)) {
2227                         TCPSTAT_INC(tcps_rcvacktoomuch);
2228                         goto dropafterack;
2229                 }
...
2234                 if (SEQ_LEQ(th->th_ack, tp->snd_una)) {
...
2248                         if (tlen == 0 && tiwin == tp->snd_wnd) {
2249                                 TCPSTAT_INC(tcps_rcvdupack);
...
2277                                 if (!tcp_timer_active(tp, TT_REXMT) ||
2278                                     th->th_ack != tp->snd_una)
2279                                         tp->t_dupacks = 0;
2280                                 else if (++tp->t_dupacks > tcprexmtthresh ||
2281                                     ((V_tcp_do_newreno ||
2282                                       (tp->t_flags & TF_SACK_PERMIT)) &&
2283                                      IN_FASTRECOVERY(tp))) {
2284                                         if ((tp->t_flags & TF_SACK_PERMIT) &&
2285                                             IN_FASTRECOVERY(tp)) {
2286                                                 int awnd;
2287
2288                                                 /*
2289                                                  * Compute the amount of data in flight first.
2290                                                  * We can inject new data into the pipe iff
2291                                                  * we have less than 1/2 the original window's
2292                                                  * worth of data in flight.
2293                                                  */
2294                                                 awnd = (tp->snd_nxt - tp->snd_fack) +
2295                                                         tp->sackhint.sack_bytes_rexmit;
2296                                                 if (awnd < tp->snd_ssthresh) {
2297                                                         tp->snd_cwnd += tp->t_maxseg;
2298                                                         if (tp->snd_cwnd > tp->snd_ssthresh)
2299                                                                 tp->snd_cwnd = tp->snd_ssthresh;
2300                                                 }
2301                                         } else
2302                                                 tp->snd_cwnd += tp->t_maxseg;
2303                                         (void) tcp_output(tp);
2304                                         goto drop;
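
The condition that leads to line 2303 can be restated as a small
user-space predicate.  This is a rough sketch of the quoted branch only:
struct conn, seq_leq() and dupack_triggers_output() are hypothetical
names, REXMT_THRESH assumes the usual tcprexmtthresh value of 3, and the
parts of the function elided with "..." above are deliberately not
modelled.

```c
#include <stdint.h>

#define REXMT_THRESH 3  /* assumption: stands in for tcprexmtthresh */

/* Hypothetical, trimmed-down stand-in for the fields of struct tcpcb used here. */
struct conn {
    uint32_t snd_una;       /* oldest unacknowledged sequence number */
    uint32_t snd_wnd;       /* current send window */
    int t_dupacks;          /* consecutive duplicate ACKs seen */
    int rexmt_active;       /* models tcp_timer_active(tp, TT_REXMT) */
    int in_fastrecovery;    /* models IN_FASTRECOVERY(tp) */
    int newreno_or_sack;    /* models V_tcp_do_newreno || TF_SACK_PERMIT */
};

/* Sequence-space "less than or equal", same idea as SEQ_LEQ(). */
static int
seq_leq(uint32_t a, uint32_t b)
{
    return ((int32_t)(a - b) <= 0);
}

/* Return 1 when the quoted branch would reach tcp_output(), else 0. */
int
dupack_triggers_output(struct conn *c, uint32_t th_ack, int tlen,
    uint32_t tiwin)
{
    if (!seq_leq(th_ack, c->snd_una))
        return (0);     /* ACK advances snd_una: not this path */
    if (tlen != 0 || tiwin != c->snd_wnd)
        return (0);     /* not a pure duplicate ACK (handling not modelled) */
    if (!c->rexmt_active || th_ack != c->snd_una) {
        c->t_dupacks = 0;
        return (0);
    }
    if (++c->t_dupacks > REXMT_THRESH ||
        (c->newreno_or_sack && c->in_fastrecovery))
        return (1);     /* corresponds to the call on line 2303 */
    return (0);
}
```

In this model, once t_dupacks has passed the threshold, every further
pure ACK with th_ack == snd_una and an unchanged window reaches the
tcp_output() call again while the retransmit timer stays active.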

I've noticed that we don't yet have this patch in our code:

http://svnweb.freebsd.org/base?view=revision&revision=239672

It seems like it could be relevant here, since it concerns the general
case of both ends of a connection entering FIN_WAIT_1 at the same time
and sending FIN/ACKs repeatedly (though our connections are a bizarre
instance of this, where both ends are actually the same connection).

Thanks,

Matt