HEADS UP: inpcb/inpcbinfo rwlocking: coming to a 7-STABLE branch near you

Robert Watson rwatson at FreeBSD.org
Mon Aug 18 08:14:21 UTC 2008

On Sun, 3 Aug 2008, Robert Watson wrote:

> This is an advance warning that, late next week, I will be merging a fairly 
> large set of changes to the IPv4 and IPv6 protocols layered over the 
> inpcb/inpcbinfo kernel infrastructure.  To be specific, this affects TCP, 
> UDP, and raw sockets on both IPv4 and IPv6.  I will post a further e-mail 
> announcement along with patch set and schedule in a day or two once it's 
> prepared.

FYI: This patch has now been committed to Subversion.  I'll keep a close eye 
out for difficulties; if you run into issues, please send me an e-mail (and CC 


Robert N M Watson
Computer Laboratory
University of Cambridge

> The thrust of this change is to replace the mutexes protecting the inpcb and 
> inpcbinfo data structures with read-write locks (rwlocks).  These structures 
> represent, respectively, particular sockets and the global socket lists for 
> all socket types in IPv4 and IPv6 except for SCTP.  When you run netstat, 
> inpcbinfo is the data structure referencing all connections, and each line in 
> the netstat output reflects the contents of a specific inpcb.
>
> In the current stage of this work, the intent is to improve performance for 
> datagram-related protocols on SMP systems by allowing concurrent acquisition 
> of both global and connection locks during receive and transmit.  This is 
> possible because, in the common case, no connection or global state is 
> modified during UDP/raw receive and transmit at the IP layer, so a read lock 
> is sufficient to prevent data in those structures from unexpectedly changing. 
> For receive, socket layer state is modified, but this is separately protected 
> by socket layer locks.  On transmit, no state is modified at any layer, so in 
> principle we will allow fully parallel transmit from multiple threads down to 
> about the routing and network interface layers, whereas previously they would 
> bottleneck in UDP.
>
> The applications targeted by this change are threaded UDP server 
> applications, such as BIND9, nsd, and UDP-based memcached.  Kris Kennaway and 
> Paul Saab have done fairly extensive testing with the changes and 
> demonstrated significant performance improvements due to reduced contention 
> and overhead.  Perhaps they can mention some of those numbers in a follow-up 
> to this post.
>
> The reason for the heads up is that, while carefully tested, changes of this 
> sort do come with risks.  We've carefully structured them so as to avoid 
> breaking the ABIs for netstat, etc, but it's not impossible that some 
> problems will arise as the changes settle.  The goal, however, is to see 
> these performance improvements in 7.1, and since they've had a bit to shake 
> out in 8.x and seen some heavy use, I think now is the right time to merge 
> them.
>
> In any case, I will send out e-mail in a couple of days with a proposed merge 
> patch and schedule for merging, and perhaps if you are in a position where 
> you might benefit from these improvements, or have interesting UDP or 
> raw-socket based applications running on 7.x, you could test the candidate 
> patch before it's merged, reporting any problems.  Unless I receive negative 
> feedback, I will plan on merging the changes late in the week, and keep a 
> close eye on stable@ for any reports of problems.
>
> Thanks,
> Robert N M Watson
> Computer Laboratory
> University of Cambridge
