kern/176446: Concurrency in ixgbe driving out-of-order packet process and spurious RST

Julien Charbon jcharbon at verisign.com
Tue Feb 26 19:20:01 UTC 2013


>Number:         176446
>Category:       kern
>Synopsis:       Concurrency in ixgbe driving out-of-order packet process and spurious RST
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Feb 26 19:20:00 UTC 2013
>Closed-Date:
>Last-Modified:
>Originator:     Julien Charbon
>Release:        FreeBSD 8.3
>Organization:
Verisign
>Environment:
FreeBSD atlas-dl360-3 8.3-RELEASE FreeBSD 8.3-RELEASE #14 r50M: Tue Feb 26 15:13:53 UTC 2013     jcharbon at atlas-dl360-3:/usr/obj/usr/src/jcharbon/svn/freebsd-src-8.3.4/sys/GENERIC  amd64
>Description:
Under TCP network load using the ixgbe driver, we found an unexpected TCP behaviour:

15:26:45.129164 IP 192.168.100.23.30222 > 192.168.100.144.8080: Flags [S], seq 2020028671, win 14600, options [mss 1460,sackOK,TS val 1622018533 ecr 0,nop,wscale 7], length 0
15:26:45.130844 IP 192.168.100.144.8080 > 192.168.100.23.30222: Flags [S.], seq 1114608110, ack 2020028672, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2170246269 ecr 1622018533], length 0
15:26:47.007276 IP 192.168.100.23.30222 > 192.168.100.144.8080: Flags [.], ack 1, win 115, options [nop,nop,TS val 1622020597 ecr 2170246269], length 0
15:26:47.007283 IP 192.168.100.23.30222 > 192.168.100.144.8080: Flags [P.], ack 1, win 115, options [nop,nop,TS val 1622020597 ecr 2170246269], length 4
15:26:47.013799 IP 192.168.100.144.8080 > 192.168.100.23.30222: Flags [R], seq 1114608111, win 0, length 0
15:26:47.019366 IP 192.168.100.144.8080 > 192.168.100.23.30222: Flags [P.], ack 5, win 1040, options [nop,nop,TS val 2170248157 ecr 1622020597], length 128
15:26:47.020353 IP 192.168.100.144.8080 > 192.168.100.23.30222: Flags [F.], seq 129, ack 5, win 1040, options [nop,nop,TS val 2170248158 ecr 1622020597], length 0
15:26:48.565166 IP 192.168.100.23.30222 > 192.168.100.144.8080: Flags [R], seq 2020028676, win 0, length 0

This TCP request was both RST'ed _and_ replied. Using net.inet.tcp.log_debug=1, we saw in debug log:

Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080; syncache_socket: in_pcbconnect failed with error 48
Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080 tcpflags 0x10<ACK>; tcp_input: Listen socket: Socket allocation failed due to limits or memory shortage, sending RST
Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080 tcpflags 0x4<RST>; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored

By adding more debug log in kernel (see joined patch), the origin of this unexpected behaviour seems to be in ixgbe driver:

Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080 tcpflags 0x18<PUSH,ACK>; syncache_expand: SEQ 2020028672, ACK 1114608111, syncache_socket() succeed
Feb 26 15:26:47 atlas-dl360-3 kernel: #0 0xffffffff80796b86 at syncache_expand+0x546
Feb 26 15:26:47 atlas-dl360-3 kernel: #1 0xffffffff8078e36b at tcp_input+0xf4b
Feb 26 15:26:47 atlas-dl360-3 kernel: #2 0xffffffff8071feec at ip_input+0xac
Feb 26 15:26:47 atlas-dl360-3 kernel: #3 0xffffffff806cb11e at netisr_dispatch_src+0x7e
Feb 26 15:26:47 atlas-dl360-3 kernel: #4 0xffffffff806c11bd at ether_demux+0x14d
Feb 26 15:26:47 atlas-dl360-3 kernel: #5 0xffffffff806c15c7 at ether_input+0x197
Feb 26 15:26:47 atlas-dl360-3 kernel: #6 0xffffffff803e691b at ixgbe_rxeof+0x1eb
Feb 26 15:26:47 atlas-dl360-3 kernel: #7 0xffffffff803e7108 at ixgbe_msix_que+0xa8
Feb 26 15:26:47 atlas-dl360-3 kernel: #8 0xffffffff805e7674 at intr_event_execute_handlers+0x104
Feb 26 15:26:47 atlas-dl360-3 kernel: #9 0xffffffff805e8d05 at ithread_loop+0x95
Feb 26 15:26:47 atlas-dl360-3 kernel: #10 0xffffffff805e488f at fork_exit+0x11f
Feb 26 15:26:47 atlas-dl360-3 kernel: #11 0xffffffff808edc7e at fork_trampoline+0xe
Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080; syncache_socket: in_pcbconnect failed with error 48
Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080 tcpflags 0x10<ACK>; syncache_expand: SEQ 2020028672, ACK 1114608111, syncache_socket() failed
Feb 26 15:26:47 atlas-dl360-3 kernel: #0 0xffffffff80796b86 at syncache_expand+0x546
Feb 26 15:26:47 atlas-dl360-3 kernel: #1 0xffffffff8078e36b at tcp_input+0xf4b
Feb 26 15:26:47 atlas-dl360-3 kernel: #2 0xffffffff8071feec at ip_input+0xac
Feb 26 15:26:47 atlas-dl360-3 kernel: #3 0xffffffff806cb11e at netisr_dispatch_src+0x7e
Feb 26 15:26:47 atlas-dl360-3 kernel: #4 0xffffffff806c11bd at ether_demux+0x14d
Feb 26 15:26:47 atlas-dl360-3 kernel: #5 0xffffffff806c15c7 at ether_input+0x197
Feb 26 15:26:47 atlas-dl360-3 kernel: #6 0xffffffff803e691b at ixgbe_rxeof+0x1eb
Feb 26 15:26:47 atlas-dl360-3 kernel: #7 0xffffffff803e7c31 at ixgbe_handle_que+0xd1
Feb 26 15:26:47 atlas-dl360-3 kernel: #8 0xffffffff8064e2c5 at taskqueue_run_locked+0x85
Feb 26 15:26:47 atlas-dl360-3 kernel: #9 0xffffffff8064e45e at taskqueue_thread_loop+0x4e
Feb 26 15:26:47 atlas-dl360-3 kernel: #10 0xffffffff805e488f at fork_exit+0x11f
Feb 26 15:26:47 atlas-dl360-3 kernel: #11 0xffffffff808edc7e at fork_trampoline+0xe
Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080 tcpflags 0x10<ACK>; tcp_input: Listen socket: Socket allocation failed due to limits or memory shortage, sending RST
Feb 26 15:26:47 atlas-dl360-3 kernel: TCP: [192.168.100.23]:30222 to [192.168.100.144]:8080 tcpflags 0x4<RST>; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored

Here we saw the concurrency between ixgbe_handle_que() and ixgbe_msix_que() both calling ixgbe_rxeof() at same time, and the client <PUSH,ACK> packet being proceed before the client <ACK> packet.

If we are currently testing FreeBSD 8.3, we saw nothing that would prevent this issue in 9.1 and in CURRENT.
>How-To-Repeat:
Load a TCP service enough to see the "syncache_socket: in_pcbconnect failed with error 48" error.
>Fix:


Patch attached with submission follows:

Index: sys/netinet/tcp_syncache.c
===================================================================
--- sys/netinet/tcp_syncache.c	(revision 50)
+++ sys/netinet/tcp_syncache.c	(working copy)
@@ -97,6 +97,11 @@
 
 #include <security/mac/mac_framework.h>
 
+#include <sys/param.h>
+#include <sys/stack.h>
+#include <sys/types.h>
+#include <sys/sbuf.h>
+
 static VNET_DEFINE(int, tcp_syncookies) = 1;
 #define	V_tcp_syncookies		VNET(tcp_syncookies)
 SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, syncookies, CTLFLAG_RW,
@@ -200,6 +205,11 @@
 #define	SCH_UNLOCK(sch)		mtx_unlock(&(sch)->sch_mtx)
 #define	SCH_LOCK_ASSERT(sch)	mtx_assert(&(sch)->sch_mtx, MA_OWNED)
 
+struct stack *st;
+struct mtx st_mtx;
+struct sbuf st_sb;
+char st_trace[65536];
+
 /*
  * Requires the syncache entry to be already removed from the bucket list.
  */
@@ -223,6 +233,10 @@
 {
 	int i;
 
+	st = stack_create();
+	mtx_init(&st_mtx, "tcp_sc_stack", NULL, MTX_DEF);
+	sbuf_new(&st_sb, st_trace, sizeof(st_trace), SBUF_FIXEDLEN);
+
 	V_tcp_syncache.cache_count = 0;
 	V_tcp_syncache.hashsize = TCP_SYNCACHE_HASHSIZE;
 	V_tcp_syncache.bucket_limit = TCP_SYNCACHE_BUCKETLIMIT;
@@ -277,6 +291,10 @@
 	struct syncache *sc, *nsc;
 	int i;
 
+	sbuf_delete(&st_sb);
+	mtx_destroy(&st_mtx);
+	stack_destroy(st);
+
 	/* Cleanup hash buckets: stop timers, free entries, destroy locks. */
 	for (i = 0; i < V_tcp_syncache.hashsize; i++) {
 
@@ -949,11 +967,27 @@
 
 	*lsop = syncache_socket(sc, *lsop, m);
 
-	if (*lsop == NULL)
+	if (*lsop == NULL) {
+		if ((s = tcp_log_addrs(inc, th, NULL, NULL)))
+			log(LOG_DEBUG, "%s; %s: SEQ %u, ACK %u, syncache_socket() "
+			    "failed\n", s, __func__, th->th_seq, th->th_ack);
 		TCPSTAT_INC(tcps_sc_aborted);
-	else
+        }
+	else {
+		if ((s = tcp_log_addrs(inc, th, NULL, NULL)))
+			log(LOG_DEBUG, "%s; %s: SEQ %u, ACK %u, syncache_socket() "
+			    "succeed\n", s, __func__, th->th_seq, th->th_ack);
 		TCPSTAT_INC(tcps_sc_completed);
+	}
 
+	mtx_lock(&st_mtx);
+	stack_save(st);
+	stack_sbuf_print(&st_sb, st);
+	sbuf_finish(&st_sb);
+	log(LOG_DEBUG, "%s", sbuf_data(&st_sb));
+	sbuf_clear(&st_sb);
+	mtx_unlock(&st_mtx);
+
 /* how do we find the inp for the new socket? */
 	if (sc != &scs)
 		syncache_free(sc);


>Release-Note:
>Audit-Trail:
>Unformatted:


More information about the freebsd-bugs mailing list