why use INP_WLOCK instead of INP_RLOCK

Robert N. M. Watson rwatson at freebsd.org
Fri Mar 25 21:38:13 UTC 2011


On 25 Mar 2011, at 21:01, John Baldwin wrote:

> On Tuesday, February 01, 2011 12:54:33 am Jim wrote:
>> I am not sure if anybody has asked it before. I could not find answer by
>> doing rough search on Internet, if it is duplicate question, sorry in
>> advance.
>> 
>> My question is that, for getting socket options in tcp_ctloutput() in
>> tcp_usrreq.c, why do we need to do lock with INP_WLOCK(inp) as setting
>> socket options does. Why don't we just use INP_RLOCK(inp), as it looks not
>> changing anything in tcp control block?
> 
> I think mostly it is just because no one has bothered to change it.  
> Realistically it probably won't make any noticeable difference unless your 
> workload consists of doing lots of calls to getsockopt() but not sending any 
> actual traffic on the associated sockets. :)  (Almost all of the other 
> operations on a TCP connection require a write lock on the pcb.)

Just to reiterate John's point here: the critical performance paths for TCP (input and output) both require the inpcb lock to be held exclusively, and socket options are typically called from the same user thread doing I/O, meaning that acquiring read locks instead of write locks is unlikely to make any measurable difference. However, in principle I believe most if not all getsockopt() calls in TCP should be fine with just a read lock, and for socket options used with UDP, there might well be some benefit to using a read lock, since most UDP operations use read locks and not write locks on the inpcb.

I should further note that Jeff Roberson has some exciting in-progress work to reduce transmit-input contention on the inpcb that appears to make quite a noticeable difference in improving TCP performance. We don't have much global lock contention currently when in the steady state, but the per-connection locks do get heavily contended. His work is similar to some work done in the Linux stack a year or two ago to defer input processing to the user thread rather than contending on the inpcb lock, if it's already held. Hopefully we'll see the results of that work in 9.0, and possibly backported to 8.x.
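
The general idea of deferring input to the thread that already owns the connection can be sketched in userspace as follows. This is only an illustration of the technique under simplifying assumptions, not Jeff's patch and not the Linux code: struct toy_conn, the toy_* functions, the fixed-size backlog and the single user thread are all hypothetical. When the connection is owned, the input path queues the segment and returns instead of waiting; the owner drains the queue when it lets go.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define TOY_BACKLOG_MAX 16	/* arbitrary size chosen for the sketch */

/* Hypothetical connection state; assumes a single user thread. */
struct toy_conn {
	pthread_mutex_t mtx;	/* short-held lock protecting all fields */
	int owned;		/* nonzero while the user thread owns the conn */
	int backlog[TOY_BACKLOG_MAX];	/* segments deferred by the input path */
	int backlog_len;
	int processed;		/* number of segments processed so far */
	long sum;		/* toy "protocol state": sum of segment values */
};

static void
toy_conn_init(struct toy_conn *c)
{
	assert(c != NULL);
	memset(c, 0, sizeof(*c));
	pthread_mutex_init(&c->mtx, NULL);
}

/* The "real" protocol work; caller must hold c->mtx. */
static void
toy_process(struct toy_conn *c, int seg)
{
	c->sum += seg;
	c->processed++;
}

/* User thread takes long-term ownership of the connection. */
static void
toy_user_acquire(struct toy_conn *c)
{
	pthread_mutex_lock(&c->mtx);
	c->owned = 1;
	pthread_mutex_unlock(&c->mtx);
}

/*
 * Input path. Returns 0 if the segment was processed immediately,
 * 1 if it was deferred to the owner, -1 if the backlog was full.
 */
static int
toy_input(struct toy_conn *c, int seg)
{
	int rc;

	pthread_mutex_lock(&c->mtx);
	if (!c->owned) {
		toy_process(c, seg);
		rc = 0;
	} else if (c->backlog_len < TOY_BACKLOG_MAX) {
		c->backlog[c->backlog_len++] = seg;
		rc = 1;
	} else {
		rc = -1;	/* dropped in this sketch */
	}
	pthread_mutex_unlock(&c->mtx);
	return (rc);
}

/* User thread drains deferred input, then gives up ownership. */
static void
toy_user_release(struct toy_conn *c)
{
	int i;

	pthread_mutex_lock(&c->mtx);
	for (i = 0; i < c->backlog_len; i++)
		toy_process(c, c->backlog[i]);
	c->backlog_len = 0;
	c->owned = 0;
	pthread_mutex_unlock(&c->mtx);
}
```

The point of the design is that the input path never blocks for the duration of the user thread's work; it holds the mutex only long enough to either process or enqueue one segment.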

I also have a large pending patchset adding connection group support, and aligning software lookup tables with hardware work distribution via RSS, which is due to go in before 9.0.

Robert

