kern/176446: [netinet] [patch] Concurrency in ixgbe driving out-of-order packet process and spurious RST

Charbon, Julien jcharbon at verisign.com
Tue Mar 12 17:20:01 UTC 2013


The following reply was made to PR kern/176446; it has been noted by GNATS.

From: "Charbon, Julien" <jcharbon at verisign.com>
To: Cc: bug-followup at freebsd.org,
        "De La Gueronniere, Marc" <mdelagueronniere at verisign.com>
Subject: Re: kern/176446: [netinet] [patch] Concurrency in ixgbe driving out-of-order
 packet process and spurious RST
Date: Tue, 12 Mar 2013 14:59:11 +0100

 This is a multi-part message in MIME format.
 --------------050609010000050001050509
 Content-Type: text/plain; charset=ISO-8859-1; format=flowed
 Content-Transfer-Encoding: 7bit
 
 On 3/7/13 11:11 AM, Charbon, Julien wrote:
 > On 2/28/13 8:10 PM, Charbon, Julien wrote:
 >> On 2/28/13 4:57 PM, John Baldwin wrote:
 >>> Can you try the fixes from http://svnweb.freebsd.org/base?view=revision&revision=240968?
 >>
 >>     Actually, Marc (I CC'ed him) did find the r240968 fix for concurrency
 >> between ixgbe_msix_que() and ixgbe_handle_que(), and made a backport for
 >> release-8.3.0 (see patch [1] below).  However, the issue was still
 >> reproducible, so Marc found another source of concurrency in
 >> ixgbe_local_timer() and fixed it (see patch [2]).  That was still not
 >> enough, and he found a last source of concurrency in the
 >> ixgbe_rearm_queues() call (see patch [3]).  With all these patches
 >> applied, we were not able to reproduce this issue.
 >
 >    Just for the record:  As expected this issue is reproducible on
 > 9.1-RELEASE:
 
      Just for the record:  As expected, this issue is also reproducible on
 10.0-CURRENT:
 
 # uname -a
 FreeBSD atlas 10.0-CURRENT FreeBSD 10.0-CURRENT #0 r248173M: Tue Mar 12 07:52:58 UTC 2013 jcharbon at atlas:/usr/obj/app/jcharbon/head/sys/GENERIC  amd64
 
   1. Enable TCP debug log:
 
 # sysctl net.inet.tcp.log_debug=1
 net.inet.tcp.log_debug: 1 -> 1
 
   2. Load a TCP service with numerous small requests/responses.
 
   3. Look in /var/log/debug.log for the pattern:
 
 Mar 12 10:31:22 atlas kernel: TCP: [192.168.100.35]:4698 to [192.168.100.152]:8080; syncache_socket: in_pcbconnect failed with error 48
 Mar 12 10:31:22 atlas-dl360-4 kernel: TCP: [192.168.100.35]:4698 to [192.168.100.152]:8080 tcpflags 0x10<ACK>; tcp_input: Listen socket: Socket allocation failed due to limits or memory shortage, sending RST
 Mar 12 10:31:22 atlas-dl360-4 kernel: TCP: [192.168.100.35]:4698 to [192.168.100.152]:8080 tcpflags 0x4<RST>; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
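 
 A minimal sketch for checking step 3, assuming each kernel message
 occupies a single line in the log file.  The function name
 count_rst_pattern is hypothetical and not part of the original report;
 it only counts lines containing each of the three messages shown above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): count how many log
# lines contain each of the three kernel messages from step 3.
# Usage: count_rst_pattern [logfile]   (defaults to /var/log/debug.log)
count_rst_pattern() {
    log="${1:-/var/log/debug.log}"
    for pat in \
        'syncache_socket: in_pcbconnect failed with error 48' \
        'tcp_input: Listen socket:' \
        'syncache_chkrst: Spurious RST'
    do
        # grep -c prints the number of matching lines; -F matches the
        # pattern as a literal string rather than a regular expression.
        printf '%s\t%s\n' "$(grep -cF -- "$pat" "$log")" "$pat"
    done
}
```

 Calling count_rst_pattern with no argument reads /var/log/debug.log.
 Non-zero counts for all three messages at the same timestamps would be
 consistent with the pattern above.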
 
   Attached is the patch we use to fix this issue in 10-CURRENT.
 
 --
 Julien
 
 --------------050609010000050001050509
 Content-Type: text/plain; charset=UTF-8; x-mac-type="0"; x-mac-creator="0";
  name="ixgbe.c.patch"
 Content-Transfer-Encoding: 7bit
 Content-Disposition: attachment;
  filename="ixgbe.c.patch"
 
 Index: sys/dev/ixgbe/ixgbe.c
 ===================================================================
 --- sys/dev/ixgbe/ixgbe.c	(revision 248173)
 +++ sys/dev/ixgbe/ixgbe.c	(working copy)
 @@ -2038,14 +2038,14 @@
  		    (paused == 0))
  			++hung;
  		else if (txr->queue_status == IXGBE_QUEUE_WORKING)
 -			taskqueue_enqueue(que->tq, &que->que_task);
 +			taskqueue_enqueue(que->tq, &txr->txq_task);
          }
  	/* Only truely watchdog if all queues show hung */
          if (hung == adapter->num_queues)
                  goto watchdog;
  
  out:
 -	ixgbe_rearm_queues(adapter, adapter->que_mask);
 +	// ixgbe_rearm_queues(adapter, adapter->que_mask);
  	callout_reset(&adapter->timer, hz, ixgbe_local_timer, adapter);
  	return;
  
 @@ -4575,7 +4575,7 @@
  	** Schedule another interrupt if so.
  	*/
  	if ((staterr & IXGBE_RXD_STAT_DD) != 0) {
 -		ixgbe_rearm_queues(adapter, (u64)(1 << que->msix));
 +		// ixgbe_rearm_queues(adapter, (u64)(1 << que->msix));
  		return (TRUE);
  	}
  
 
 --------------050609010000050001050509--

