Server sporadically sending unexpected RST to Client

Andre Oppermann andre at freebsd.org
Tue Aug 10 09:11:51 UTC 2010


On 09.08.2010 15:03, Seth Jeacopello wrote:
> Thanks for the quick reply Andre; we have some new information.
>
> First I took some time to review some of the tcpdumps per your
> recommendation and have not found /any/ reuse (with most dumps spanning
> approx. a one hour time frame and the problem occurring toward the end of
> the time frame).

OK.  I thought this to be the most likely source of the problem.

> The client system is another FreeBSD system (we are unsure of the version at
> this time).

If there is no port reuse then the client OS shouldn't matter.

> You may be correct about the syncache simply showing the symptoms; as we dug
> deeper we began looking at changes in netisr, in particular the direct
> dispatch policy modifications.  We've run some tests over the weekend and
> found something that seems to work for us.
>
> We've found that moving from 'Always Direct' to 'Hybrid' mode seems to
> resolve the issue for us without any noticed consequences (setting
> net.isr.direct_force=0).  Can anyone comment on this setting and let us know
> of any downsides or problems that may occur running in this mode?

I haven't worked on the netisr code, but a quick glance suggests that running
in hybrid mode should be fine and not cause any further problems.

> We believe that this problem is also only isolated to one of our Server
> platforms (testing on our other platform is still on-going, though initial
> results look good).

OK.

> Both platforms are Intel based (current generation vs. last generation) with
> various differences, though the one that may be most relevant is the change
> of the on-board NIC from being 'em' based to 'igb' based (that is, the
> systems with the issue all have 'em' based NICs vs. 'igb' of the newer
> systems).  This could be a red herring as well, though I feel it's probably a
> good idea to include as much information as possible when troubleshooting.

It is unlikely that the network card or the driver has anything to do with it.

> Thank you for all of your help and I look forward to hearing any further
> thoughts on this issue.

Please try the attached patch so I get better information from syncache_socket
on the particular error that comes up.  Socket creation and PCB setup are very
complicated areas.
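
For anyone reading along without the kernel source at hand, the patch boils
down to one pattern: keep the return value of the failing call instead of only
testing it, and report it before jumping to the abort path. A minimal userland
sketch of that pattern follows. fake_pcbinshash() and try_setup() are
hypothetical stand-ins made up for illustration, not kernel functions, and
snprintf() into a buffer stands in for the kernel's log().

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a call such as in_pcbinshash():
 * returns 0 on success or an errno value on failure.
 */
static int
fake_pcbinshash(int simulate_failure)
{
	return (simulate_failure ? ENOBUFS : 0);
}

/*
 * Mirrors the shape of the patched code: the return value is kept in
 * 'error', written into a message on failure, and only then does the
 * function take the abort path.  'buf' receives the message and is left
 * empty on success.
 */
static int
try_setup(int simulate_failure, char *buf, size_t buflen)
{
	int error = 0;

	buf[0] = '\0';
	if ((error = fake_pcbinshash(simulate_failure)) != 0) {
		snprintf(buf, buflen,
		    "%s: fake_pcbinshash failed with error %i",
		    __func__, error);
		goto abort;
	}
	return (0);
abort:
	return (error);
}
```

Calling try_setup(1, buf, sizeof(buf)) returns ENOBUFS and leaves a message
naming the failing call in buf, while try_setup(0, ...) returns 0 with an
empty buffer. The real patch does the same thing at three call sites so the
log shows which of them fails and with which error.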

-- 
Andre

Index: tcp_syncache.c
===================================================================
--- tcp_syncache.c	(revision 211131)
+++ tcp_syncache.c	(working copy)
@@ -627,6 +627,7 @@
  	struct inpcb *inp = NULL;
  	struct socket *so;
  	struct tcpcb *tp;
+	int error = 0;
  	char *s;

  	INP_INFO_WLOCK_ASSERT(&V_tcbinfo);
@@ -675,7 +676,7 @@
  	}
  #endif
  	inp->inp_lport = sc->sc_inc.inc_lport;
-	if (in_pcbinshash(inp) != 0) {
+	if ((error = in_pcbinshash(inp)) != 0) {
  		/*
  		 * Undo the assignments above if we failed to
  		 * put the PCB on the hash lists.
@@ -687,6 +688,12 @@
  #endif
  			inp->inp_laddr.s_addr = INADDR_ANY;
  		inp->inp_lport = 0;
+		if ((s = tcp_log_addrs(&sc->sc_inc, NULL, NULL, NULL))) {
+			log(LOG_DEBUG, "%s; %s: in_pcbinshash failed "
+			    "with error %i\n",
+			    s, __func__, error);
+			free(s, M_TCPLOG);
+		}
  		goto abort;
  	}
  #ifdef IPSEC
@@ -721,9 +728,15 @@
  		laddr6 = inp->in6p_laddr;
  		if (IN6_IS_ADDR_UNSPECIFIED(&inp->in6p_laddr))
  			inp->in6p_laddr = sc->sc_inc.inc6_laddr;
-		if (in6_pcbconnect(inp, (struct sockaddr *)&sin6,
-		    thread0.td_ucred)) {
+		if ((error = in6_pcbconnect(inp, (struct sockaddr *)&sin6,
+		    thread0.td_ucred)) != 0) {
  			inp->in6p_laddr = laddr6;
+			if ((s = tcp_log_addrs(&sc->sc_inc, NULL, NULL, NULL))) {
+				log(LOG_DEBUG, "%s; %s: in6_pcbconnect failed "
+				    "with error %i\n",
+				    s, __func__, error);
+				free(s, M_TCPLOG);
+			}
  			goto abort;
  		}
  		/* Override flowlabel from in6_pcbconnect. */
@@ -750,9 +763,15 @@
  		laddr = inp->inp_laddr;
  		if (inp->inp_laddr.s_addr == INADDR_ANY)
  			inp->inp_laddr = sc->sc_inc.inc_laddr;
-		if (in_pcbconnect(inp, (struct sockaddr *)&sin,
-		    thread0.td_ucred)) {
+		if ((error = in_pcbconnect(inp, (struct sockaddr *)&sin,
+		    thread0.td_ucred)) != 0) {
  			inp->inp_laddr = laddr;
+			if ((s = tcp_log_addrs(&sc->sc_inc, NULL, NULL, NULL))) {
+				log(LOG_DEBUG, "%s; %s: in_pcbconnect failed "
+				    "with error %i\n",
+				    s, __func__, error);
+				free(s, M_TCPLOG);
+			}
  			goto abort;
  		}
  	}

