TCP stack lock contention with short-lived connections

Julien Charbon jcharbon at
Thu Nov 7 14:10:51 UTC 2013

  Hi list,

On Mon, 04 Nov 2013 22:21:04 +0100, Julien Charbon <jcharbon at> wrote:
>   just a follow-up of vBSDCon discussions about FreeBSD TCP performance
> with short-lived connections.  In summary: <snip>
> I have put technical and how-to-repeat details in the PR below:
> kern/183659: TCP stack lock contention with short-lived connections
>   We are currently working on this performance improvement effort;  it
> will impact only the TCP locking strategy, not the TCP stack logic
> itself.  We will share the patches we made on freebsd-net for review
> and improvement proposals;  this change might also require enough
> eyeballs to avoid introducing tricky race conditions into the TCP
> stack.

  Just a follow-up:  We are currently removing the TCP INP_INFO lock from
places where it is not actually required, in order to mitigate the lock
contention.  It seems to be a good first step in this effort:  Small
changes, easy to review, low risk (and small gain... right).

  Below is a first patch that removes the INP_INFO lock from
tcp_usr_accept().  This change simply follows the advice given in the
corresponding code comment:  "A better fix would prevent the socket from
being placed in the listen queue until all fields are fully initialized."
For more technical details, check the comment in the related change below:

  With this patch applied we see no regressions and a performance
improvement of ~5%, i.e. with a vanilla 9.2 kernel:  52k TCP Queries Per
Second, with 9.2 + the attached patch:  55k TCP QPS.  Not huge, indeed, but
still an improvement.
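
  For readers who want the idea without the kernel context, here is a
minimal userland sketch of the ordering the patch relies on:  fill in every
field first, and only then make the connection visible on the listen queue.
All names here (struct conn, struct listenq, conn_setup(), and so on) are
hypothetical stand-ins and not the FreeBSD structures or functions;  the
pthread mutex plays the role of the queue's own lock.  It only illustrates
the pattern, not the actual patch.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the socket/inpcb pair; not a kernel structure. */
struct conn {
	uint32_t	 faddr;		/* peer address */
	uint16_t	 fport;		/* peer port */
	struct conn	*next;		/* link in the completed queue */
};

/* Hypothetical listen queue; its own mutex is the only lock involved. */
struct listenq {
	pthread_mutex_t	 mtx;
	struct conn	*head;
	struct conn	*tail;
};

/* Make an already initialized connection visible to acceptors. */
static void
listenq_publish(struct listenq *q, struct conn *c)
{
	c->next = NULL;
	pthread_mutex_lock(&q->mtx);
	if (q->tail != NULL)
		q->tail->next = c;
	else
		q->head = c;
	q->tail = c;
	pthread_mutex_unlock(&q->mtx);
}

/*
 * Connection setup: initialize every field first, publish last.
 * This mirrors the ordering idea only; the patch itself achieves it
 * by calling soisconnected() at the end of syncache_socket().
 */
static void
conn_setup(struct listenq *q, struct conn *c, uint32_t faddr, uint16_t fport)
{
	c->faddr = faddr;
	c->fport = fport;
	listenq_publish(q, c);
}

/*
 * Accept: anything dequeued is already complete, so no additional
 * global lock is needed to read the peer address and port.
 * Returns 0 on success, -1 if the queue is empty.
 */
static int
listenq_accept(struct listenq *q, uint32_t *faddr, uint16_t *fport)
{
	struct conn *c;

	pthread_mutex_lock(&q->mtx);
	c = q->head;
	if (c != NULL) {
		q->head = c->next;
		if (q->head == NULL)
			q->tail = NULL;
	}
	pthread_mutex_unlock(&q->mtx);
	if (c == NULL)
		return (-1);
	*faddr = c->faddr;
	*fport = c->fport;
	return (0);
}
```

  Because the fields are written before the queue mutex is taken in
listenq_publish(), any thread that later dequeues the entry under the same
mutex is guaranteed to see them, which is why the accept side needs nothing
beyond the queue lock in this sketch.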

  P.S.:  Funnily enough, it seems that the same change has already been
proposed in the past:


From: Julien Charbon <jcharbon at>
Subject: [PATCH] Add new socket in listen queue only when fully initialized

  sys/netinet/tcp_syncache.c | 4 +++-
  sys/netinet/tcp_usrreq.c   | 9 ---------
  2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/sys/netinet/tcp_syncache.c b/sys/netinet/tcp_syncache.c
index af1651a..eb73356 100644
--- a/sys/netinet/tcp_syncache.c
+++ b/sys/netinet/tcp_syncache.c
@@ -660,7 +660,7 @@ syncache_socket(struct syncache *sc, struct socket *lso, struct mbuf *m)
  	 * connection when the SYN arrived.  If we can't create
  	 * the connection, abort it.
  	 */
-	so = sonewconn(lso, SS_ISCONNECTED);
+	so = sonewconn(lso, 0);
  	if (so == NULL) {
  		/*
  		 * Drop the connection; we will either send a RST or
@@ -890,6 +890,8 @@ syncache_socket(struct syncache *sc, struct socket *lso, struct mbuf *m)


+	soisconnected(so);
  	return (so);

diff --git a/sys/netinet/tcp_usrreq.c b/sys/netinet/tcp_usrreq.c
index b83f34a..566cc34 100644
--- a/sys/netinet/tcp_usrreq.c
+++ b/sys/netinet/tcp_usrreq.c
@@ -609,13 +609,6 @@ out:
  /*
   * Accept a connection.  Essentially all the work is done at higher levels;
   * just return the address of the peer, storing through addr.
- *
- * The rationale for acquiring the tcbinfo lock here is somewhat complicated,
- * and is described in detail in the commit log entry for r175612.  Acquiring
- * it delays an accept(2) racing with sonewconn(), which inserts the socket
- * before the inpcb address/port fields are initialized.  A better fix would
- * prevent the socket from being placed in the listen queue until all fields
- * are fully initialized.
   */
  static int
  tcp_usr_accept(struct socket *so, struct sockaddr **nam)
@@ -632,7 +625,6 @@ tcp_usr_accept(struct socket *so, struct sockaddr **nam)

  	inp = sotoinpcb(so);
  	KASSERT(inp != NULL, ("tcp_usr_accept: inp == NULL"));
-	INP_INFO_RLOCK(&V_tcbinfo);
  	if (inp->inp_flags & (INP_TIMEWAIT | INP_DROPPED)) {
  		error = ECONNABORTED;
@@ -652,7 +644,6 @@ tcp_usr_accept(struct socket *so, struct sockaddr **nam)
-	INP_INFO_RUNLOCK(&V_tcbinfo);
  	if (error == 0)
  		*nam = in_sockaddr(port, &addr);
  	return error;
