Deadlock, exclusive sx so_rcv_sx, amd64
Robert Watson
rwatson at FreeBSD.org
Fri Oct 26 14:42:14 PDT 2007
On Fri, 26 Oct 2007, John Baldwin wrote:
> "sbwait" is waiting for data to come in on a socket and "pfault" is waiting
> on disk I/O. It is a bit odd that 1187 is holding a lock while sleeping
> though that is permitted with an sx lock. Still, if it's supposed to
> protect the socket's receive buffer that is odd. Maybe get a trace of the
> process blocked in "sbwait" (tr <pid>) and bug rwatson@ about it.
This is normal -- there are two kinds of locks on each socket buffer: a mutex
protecting the integrity of the data structure, and an sx lock serializing I/O
on the socket buffer. The latter is intended to prevent I/O interlacing, and
replaced the older sblock/sbunlock implemented using tsleep(), flags, and the
mutex as an interlock. It is normal for the sx lock to be held over sleeps --
both sbwait, indicating that the I/O has not yet been completed but is waiting
on the network or remote endpoint, and a page fault, indicating that a data
copy to or from user space is in progress and has blocked waiting on paging.
Other threads blocked on the sx lock sleep interruptibly, thanks to Attilio's
addition of interruptible sx lock calls.
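To make the two-lock arrangement concrete, here is a rough userspace sketch
using pthreads. It is not the kernel code: struct sockbuf_sketch, sb_init(),
sb_append() and sb_recv_all() are names made up for illustration, a plain
pthread mutex stands in for the sx lock (so waiters here are not
interruptible), and a condition variable stands in for sbwait.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace analog of a socket receive buffer. */
struct sockbuf_sketch {
    pthread_mutex_t mtx;     /* protects buf/len; held only briefly */
    pthread_cond_t  cv;      /* signalled when data arrives (sbwait analog) */
    pthread_mutex_t io_lock; /* serializes whole I/O operations (sx analog) */
    char            buf[256];
    size_t          len;
};

void sb_init(struct sockbuf_sketch *sb)
{
    pthread_mutex_init(&sb->mtx, NULL);
    pthread_cond_init(&sb->cv, NULL);
    pthread_mutex_init(&sb->io_lock, NULL);
    sb->len = 0;
}

/* "Network" side: append up to n bytes, return how many were accepted. */
size_t sb_append(struct sockbuf_sketch *sb, const char *data, size_t n)
{
    pthread_mutex_lock(&sb->mtx);
    assert(sb->len <= sizeof(sb->buf));
    size_t space = sizeof(sb->buf) - sb->len;
    if (n > space)
        n = space;
    memcpy(sb->buf + sb->len, data, n);
    sb->len += n;
    pthread_cond_broadcast(&sb->cv);
    pthread_mutex_unlock(&sb->mtx);
    return n;
}

/* Receive exactly `want` bytes as one operation that cannot be interlaced. */
void sb_recv_all(struct sockbuf_sketch *sb, char *out, size_t want)
{
    size_t got = 0;

    pthread_mutex_lock(&sb->io_lock);   /* stays held across the sleeps below */
    while (got < want) {
        pthread_mutex_lock(&sb->mtx);
        while (sb->len == 0)
            pthread_cond_wait(&sb->cv, &sb->mtx); /* drops mtx, keeps io_lock */
        size_t n = sb->len < want - got ? sb->len : want - got;
        /* Simplification: the copy is done under mtx in this sketch. */
        memcpy(out + got, sb->buf, n);
        memmove(sb->buf, sb->buf + n, sb->len - n);
        sb->len -= n;
        got += n;
        pthread_mutex_unlock(&sb->mtx);
    }
    pthread_mutex_unlock(&sb->io_lock);
}
```

A second thread calling sb_recv_all() while the first is asleep waiting for
data blocks on io_lock rather than on mtx, which is the analog of what the
report shows: the outer lock held by a sleeping thread, the inner mutex free.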
It's not impossible that there are deadlocks involved, but if so, they likely
existed before the change to formal sx locks as the previous "by hand" lock
construction had essentially identical (but slower) properties. There is an
interesting question about whether the strong semantics in the presence of
interlaced I/O requests (i.e., simultaneous requests from multiple threads on
a single socket) are required, in which case we might be able to weaken the
locking here with some reworking of the socket buffer data structures and
send/receive routines. For the time being we should leave them as-is for
stream sockets; we have already optimized them out for UDP sockets by virtue
of a simplified sosend_dgram(), which was part of our optimization work for BIND.
FYI, BIND uses a single UDP socket for all transactions, and since each
transaction is atomic (being a datagram), the overhead of socket buffer
locking was significant, not to mention unnecessary. This problem was
originally pointed out by Jinmei Tatuya.
So, in summary: sleeping while holding the so_rcv/so_snd sx locks is normal,
but deadlocks are not, so if the pointer comes back in the direction of the
socket code after some more investigation, let me know.
Robert N M Watson
Computer Laboratory
University of Cambridge