kern/185812: send(2) on a UNIX domain SEQPACKET socket returns EMSGSIZE instead of EAGAIN

Alan Somers asomers at freebsd.org
Mon Feb 17 22:10:31 UTC 2014


SOCK_SEQPACKET Unix domain sockets don't work on FreeBSD.  kern/185812
is one of the reasons.  When send(2) ought to block on a seqpacket
socket, it returns EMSGSIZE instead.  If the socket is nonblocking,
send(2) returns EMSGSIZE instead of EAGAIN.  The problem can be
demonstrated on FreeBSD 10 or head by the ATF testcase
sys/kern/unix_seqpacket_test:eagain_8k_8k.
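
A minimal userland sketch of the symptom, for readers without the ATF
suite at hand.  This is not the ATF test case itself: the helper name
first_send_errno, the packet size, and the iteration cap are all
illustrative assumptions.  It fills a nonblocking SOCK_SEQPACKET
socketpair and reports the errno of the first send(2) that fails.

```c
#include <sys/types.h>
#include <sys/socket.h>

#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: send pktsize-byte packets on a nonblocking
 * SOCK_SEQPACKET socketpair until send(2) fails.  Returns the errno of
 * the failing send, 0 if the iteration cap was reached without a
 * failure, or -1 if setup failed.
 */
int
first_send_errno(size_t pktsize)
{
	char buf[8192];
	int sv[2];
	int err = 0;

	if (pktsize > sizeof(buf))
		return (-1);
	memset(buf, 0, sizeof(buf));

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) != 0)
		return (-1);
	if (fcntl(sv[0], F_SETFL, O_NONBLOCK) != 0) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}

	/* Nobody reads from sv[1], so the buffers must fill eventually. */
	for (int i = 0; i < 100000; i++) {
		if (send(sv[0], buf, pktsize, 0) < 0) {
			err = errno;
			break;
		}
	}

	close(sv[0]);
	close(sv[1]);
	return (err);
}
```

Calling first_send_errno(1024) should yield EAGAIN on a system with
correct behavior; on a system with the bug described here, the
expectation is EMSGSIZE.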

The problem dates to an old hack.  It's at least as old as 4.4BSD
Lite.  When you write to a Unix domain socket, the data goes directly
to the receiving socket's sockbuf, bypassing the sending socket's
sockbuf.  However, sosend_generic doesn't know anything about Unix
domain sockets, and it doesn't know anything about the receiving
socket.  Without some form of backpressure, sosend_generic would never
block.  So, uipc_send updates the _sending_ sockbuf's sb_hiwat to
account for whatever it wrote to the _receiving_ sockbuf.  (For those
not in the know, sb_hiwat is the maximum allowed amount of data in the
buffer.)  The next time that sosend_generic gets called, it sees that
the sending sockbuf is empty, but it has a lower maximum size than
before.  If the maximum size is 0, sosend_generic will block.  This
hack worked fine for SOCK_STREAM sockets, but it breaks SOCK_SEQPACKET
sockets, since the latter consist of messages that must be sent
atomically.  When sb_hiwat is too low to fit an entire message,
sosend_generic will return EMSGSIZE instead of blocking or returning
EAGAIN.
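
The interaction above can be sketched as a toy model.  This is a
simplified userland illustration, not the kernel code: the toy_* names
and the struct are hypothetical, and the admission check is reduced to
the two conditions described in this message (an atomic message larger
than sb_hiwat fails with EMSGSIZE; otherwise a lack of space means
"would block").

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the sending socket's sockbuf; not the kernel struct. */
struct toy_sockbuf {
	size_t hiwat;	/* maximum allowed bytes (models sb_hiwat) */
	size_t cc;	/* bytes queued; stays 0 on the sending side */
};

/*
 * Models the admission check in the send path.  Returns 0 if the send
 * may proceed, or the errno a nonblocking caller would see.
 */
int
toy_sosend_check(const struct toy_sockbuf *snd, size_t resid, bool atomic)
{
	size_t space = snd->hiwat > snd->cc ? snd->hiwat - snd->cc : 0;

	/* An atomic message that exceeds the limit can never fit. */
	if (atomic && resid > snd->hiwat)
		return (EMSGSIZE);
	/* Not enough room right now: block, or EAGAIN if nonblocking. */
	if (space < resid && (atomic || space == 0))
		return (EAGAIN);
	return (0);
}

/*
 * Models the hack: data written to the receiver is charged against the
 * sender's limit instead of the sender's byte count.
 */
void
toy_uipc_send(struct toy_sockbuf *snd, size_t nbytes)
{
	snd->hiwat = nbytes < snd->hiwat ? snd->hiwat - nbytes : 0;
}
```

Starting from hiwat = 8192, charging 8000 bytes leaves hiwat = 192.  A
1024-byte atomic (SEQPACKET-style) message then fails the first check
with EMSGSIZE, while a non-atomic (STREAM-style) write of the same
size still proceeds and only sees EAGAIN once hiwat reaches 0.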

Fortunately, we have a template for how to fix this bug.  DragonFlyBSD
fixed it back in 2008.  Instead of applying backpressure through
sb_hiwat, it uses a new sockbuf flag called SSB_STOP.  When the
receiving sockbuf runs out of space, uipc_send sets SSB_STOP on the
sending sockbuf.  Then, sosend_generic will block (or return EAGAIN)
on the next attempt to write.  This solution is very clean and simple.
It might also be slightly faster than the legacy method, because it
eliminates the need to call chgsbsize() on every send() and recv().  I
am aware of one drawback: since ssb_space() will only ever return 0 or
ssb_hiwat, sosend_generic will allow the sockbuf to exceed its nominal
maximum size by at most one packet of size less than ssb_hiwat.  I
don't think that's a serious problem.  In fact, I'm not even positive
that FreeBSD guarantees a socket will always stay within its nominal
size limit.
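
A toy model of the flag-based approach, under the same caveats: the
toy_* names, the flag value, and the struct are hypothetical and this
is not DragonFly's code.  It also assumes the flag is cleared once the
receiver has room again, which this message does not spell out.  The
point is that the limit itself never shrinks, so EMSGSIZE is reserved
for messages that are genuinely too large.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SSB_STOP	0x1	/* hypothetical value; models SSB_STOP */

/* Toy stand-in for the sending sockbuf; not the kernel struct. */
struct toy_ssb {
	size_t hiwat;		/* fixed limit; never adjusted per send */
	unsigned int flags;
};

/* Space is all-or-nothing: 0 while stopped, otherwise the full limit. */
size_t
toy_ssb_space(const struct toy_ssb *ssb)
{
	return ((ssb->flags & TOY_SSB_STOP) ? 0 : ssb->hiwat);
}

/* Receiver ran out of space: stop the sender. */
void
toy_stop_set(struct toy_ssb *snd)
{
	snd->flags |= TOY_SSB_STOP;
}

/* Assumed counterpart: receiver has room again, let the sender go. */
void
toy_stop_clear(struct toy_ssb *snd)
{
	snd->flags &= ~TOY_SSB_STOP;
}

/* Same simplified admission check as before, using toy_ssb_space(). */
int
toy_stop_check(const struct toy_ssb *snd, size_t resid, bool atomic)
{
	size_t space = toy_ssb_space(snd);

	if (atomic && resid > snd->hiwat)
		return (EMSGSIZE);	/* genuinely oversized message */
	if (space < resid && (atomic || space == 0))
		return (EAGAIN);	/* would block */
	return (0);
}
```

With hiwat = 8192 and the flag set, a 1024-byte atomic message gets
EAGAIN rather than EMSGSIZE, and only a message larger than 8192 bytes
is rejected as too big.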

Does this solution sound acceptable in FreeBSD?  Is there any reason
that I shouldn't port it?  Note that DragonFly long ago refactored
struct sockbuf into two separate structures: struct sockbuf and struct
signalsockbuf.  I won't make that change as part of the port.

In case you're wondering, NetBSD 6.0 suffers from the same bug,
OpenBSD 5.4 doesn't appear to support SOCK_SEQPACKET Unix domain
sockets, and Linux 3.2.0 does not suffer from it.

The relevant commit in DragonFlyBSD:
https://github.com/DragonFlyBSD/DragonFlyBSD/commit/3a6117bbe0ed6a87605c1e43e12a1438d8844380

-Alan

