TCP and syncache question

Andre Oppermann andre at freebsd.org
Thu Nov 20 04:51:38 PST 2008


Harti Brandt wrote:
> Hi Andre,
> 
> On Mon, 17 Nov 2008, Andre Oppermann wrote:
> 
> AO>This is a bit more complicated because of interactions with tcp_input()
> AO>where syncache_expand() is called from.
> AO>
> AO>The old code (as of December 2002) behaved slightly different.  It would
> AO>not remove the syncache entry when (SND.UNA == SEG.ACK) but send a RST.
> AO>The (RCV.NXT =< SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND) test wasn't done at
> AO>all.  Instead a socket was opened whenever (SND.UNA == SEG.ACK) succeeded.
> AO>This gave way to the "LAND" DoS attack which was mostly fixed with a test
> AO>for (RCV.IRS < SEG.SEQ).
> AO>
> AO>See the attached patch for fixed version of syncache_expand().  This patch
> AO>is untested though.  My development machine is currently down.  Harti, Rui
> AO>and Bjoern, please have a look at the patch and review it.
> 
> Some small problems:
...
> Need another cast here: *lsop = (struct socket *)1.

Changed the logic to use a NULL *lsop to differentiate in tcp_input().
Much simpler.

> [snip]
> 
> I've re-run my test scripts and they seem to indicate, that the socket is
> now kept in the correct state when the incoming segment had an incorrect
> ack. I could no yet run the tests for an incorrect seqno, though. This is
> because there is an interesting problem in RFC793 (and MIL-STD-1778): the RFC
> states on page 36 a general rule that a 'reset (RST) must be sent whenever
> a segment arrives which apparently is not intended for the current connection.'

The full quote from page 36 is: "As a general rule, reset (RST) must be sent
whenever a segment arrives which apparently is not intended for the current
connection.  A reset must not be sent if it is not clear that this is the case."

> I would say, that a segment carrying a sequence number at irs (when without SYN)
> and below irs (in any case) cannot belong to the current connection - those
> sequence numbers just don't exist for the connection. On the other hand p.69
> says that we must send an ACK. If this were an ITU-T standard, things
> would be clear, because the prosa description would be normative, not the
> algorithm.

The more specific rule wins.  Sending the "challenge" ACK is done under the
assumption that the remote end will send a reset to our "challenge" ACK if
such an connection doesn't exist there.

 > I've tried to follow the classical BSD code in Stevens and
> it seems that before syncaches and stuff, an ACK was sent for a bad
> sequence number. So I'll change my tests (probably tomorrow in the evening)
> and check that the patch is correct for this case too. At a first glance
> a nice SYN+ACK came out...

An updated patch is attached.

-- 
Andre
-------------- next part --------------
--- tcp_input.c.1.390	Mon Nov 17 21:33:25 2008
+++ tcp_input.c.1.390.mod	Thu Nov 20 08:25:36 2008
@@ -642,10 +642,13 @@ findpcb:
 			if (!syncache_expand(&inc, &to, th, &so, m)) {
 				/*
 				 * No syncache entry or ACK was not
-				 * for our SYN/ACK.  Send a RST.
+				 * for our SYN/ACK.  Send a RST or
+				 * an ACK for re-synchronization.
 				 * NB: syncache did its own logging
 				 * of the failure cause.
 				 */
+				if (so == NULL)
+					goto dropunlock;
 				rstreason = BANDLIM_RST_OPENPORT;
 				goto dropwithreset;
 			}
--- tcp_syncache.c.1.160	Mon Nov 17 16:49:01 2008
+++ tcp_syncache.c.1.160.mod	Thu Nov 20 12:13:26 2008
@@ -823,53 +823,39 @@ syncache_expand(struct in_conninfo *inc,
 	if (sc == NULL) {
 		/*
 		 * There is no syncache entry, so see if this ACK is
-		 * a returning syncookie.  To do this, first:
-		 *  A. See if this socket has had a syncache entry dropped in
-		 *     the past.  We don't want to accept a bogus syncookie
-		 *     if we've never received a SYN.
-		 *  B. check that the syncookie is valid.  If it is, then
-		 *     cobble up a fake syncache entry, and return.
+		 * a returning syncookie.  If the syncookie is valid,
+		 * cobble up a fake syncache entry and create a socket.
+		 *
+		 * NB: Syncache head is locked for the syncookie access.
 		 */
 		if (!tcp_syncookies) {
-			SCH_UNLOCK(sch);
 			if ((s = tcp_log_addrs(inc, th, NULL, NULL)))
 				log(LOG_DEBUG, "%s; %s: Spurious ACK, "
 				    "segment rejected (syncookies disabled)\n",
 				    s, __func__);
-			goto failed;
+			goto sendrst;
 		}
 		bzero(&scs, sizeof(scs));
 		sc = syncookie_lookup(inc, sch, &scs, to, th, *lsop);
-		SCH_UNLOCK(sch);
 		if (sc == NULL) {
 			if ((s = tcp_log_addrs(inc, th, NULL, NULL)))
 				log(LOG_DEBUG, "%s; %s: Segment failed "
 				    "SYNCOOKIE authentication, segment rejected "
 				    "(probably spoofed)\n", s, __func__);
-			goto failed;
+			goto sendrst;
 		}
-	} else {
-		/* Pull out the entry to unlock the bucket row. */
-		TAILQ_REMOVE(&sch->sch_bucket, sc, sc_hash);
-		sch->sch_length--;
-		V_tcp_syncache.cache_count--;
-		SCH_UNLOCK(sch);
+		goto expand;		/* fully validated through syncookie */
 	}
 
 	/*
 	 * Segment validation:
-	 * ACK must match our initial sequence number + 1 (the SYN|ACK).
-	 */
-	if (th->th_ack != sc->sc_iss + 1 && !TOEPCB_ISSET(sc)) {
-		if ((s = tcp_log_addrs(inc, th, NULL, NULL)))
-			log(LOG_DEBUG, "%s; %s: ACK %u != ISS+1 %u, segment "
-			    "rejected\n", s, __func__, th->th_ack, sc->sc_iss);
-		goto failed;
-	}
-
-	/*
+	 *
 	 * The SEQ must fall in the window starting at the received
 	 * initial receive sequence number + 1 (the SYN).
+	 * If not the segment may be from an earlier connection.
+	 * We send an ACK to re-synchronize the connection and keep
+	 * the syncache entry without ajusting its timers.
+	 * See RFC793 page 69, first check sequence number [SYN_RECEIVED].
 	 */
 	if ((SEQ_LEQ(th->th_seq, sc->sc_irs) ||
 	    SEQ_GT(th->th_seq, sc->sc_irs + sc->sc_wnd)) &&
@@ -877,14 +863,41 @@ syncache_expand(struct in_conninfo *inc,
 		if ((s = tcp_log_addrs(inc, th, NULL, NULL)))
 			log(LOG_DEBUG, "%s; %s: SEQ %u != IRS+1 %u, segment "
 			    "rejected\n", s, __func__, th->th_seq, sc->sc_irs);
-		goto failed;
+		(void) syncache_respond(sc);
+		*lsop = NULL;			/* prevent RST */
+		goto sendrstkeep;
+	}
+
+	/*
+	 * ACK must match our initial sequence number + 1 (the SYN|ACK).
+	 * If not the segment may be from an earlier connection.  We send
+	 * a RST but keep the syncache entry without ajusting its timers.
+	 * See RFC793 page 72, fifth check the ACK field, [SYN_RECEIVED].
+	 */
+	if (th->th_ack != sc->sc_iss + 1 && !TOEPCB_ISSET(sc)) {
+		if ((s = tcp_log_addrs(inc, th, NULL, NULL)))
+			log(LOG_DEBUG, "%s; %s: ACK %u != ISS+1 %u, segment "
+			    "rejected\n", s, __func__, th->th_ack, sc->sc_iss);
+		goto sendrstkeep;
 	}
 
+	/*
+	 * Remove the entry to unlock the bucket row.
+	 * Tests from now on are fatal and remove the syncache entry.
+	 */
+	TAILQ_REMOVE(&sch->sch_bucket, sc, sc_hash);
+	sch->sch_length--;
+	V_tcp_syncache.cache_count--;
+
+	/*
+	 * If timestamps were not negotiated they must not show up later.
+	 * See RFC1312bis, section 1.3, second paragraph 
+	 */
 	if (!(sc->sc_flags & SCF_TIMESTAMP) && (to->to_flags & TOF_TS)) {
 		if ((s = tcp_log_addrs(inc, th, NULL, NULL)))
 			log(LOG_DEBUG, "%s; %s: Timestamp not expected, "
 			    "segment rejected\n", s, __func__);
-		goto failed;
+		goto sendrst;
 	}
 	/*
 	 * If timestamps were negotiated the reflected timestamp
@@ -896,9 +909,11 @@ syncache_expand(struct in_conninfo *inc,
 			log(LOG_DEBUG, "%s; %s: TSECR %u != TS %u, "
 			    "segment rejected\n",
 			    s, __func__, to->to_tsecr, sc->sc_ts);
-		goto failed;
+		goto sendrst;
 	}
 
+expand:
+	SCH_UNLOCK(sch);
 	*lsop = syncache_socket(sc, *lsop, m);
 
 	if (*lsop == NULL)
@@ -906,16 +921,18 @@ syncache_expand(struct in_conninfo *inc,
 	else
 		V_tcpstat.tcps_sc_completed++;
 
-/* how do we find the inp for the new socket? */
 	if (sc != &scs)
 		syncache_free(sc);
 	return (1);
-failed:
-	if (sc != NULL && sc != &scs)
+
+sendrst:
+	if (sc != &scs)
 		syncache_free(sc);
+sendrstkeep:
+	SCH_LOCK_ASSERT(sch);
+	SCH_UNLOCK(sch);
 	if (s != NULL)
 		free(s, M_TCPLOG);
-	*lsop = NULL;
 	return (0);
 }
 


More information about the freebsd-net mailing list