kern/94772: FIFOs (named pipes) + select() == broken

Sun Mar 26 12:00:42 UTC 2006

The following reply was made to PR kern/94772; it has been noted by GNATS.

From: Bruce Evans <bde at zeta.org.au>
To: Oliver Fromme <olli at lurza.secnetix.de>
Cc: bug-followup at freebsd.org
Subject: Re: kern/94772: FIFOs (named pipes) + select() == broken
Date: Sun, 26 Mar 2006 22:50:17 +1100 (EST)

 On Fri, 24 Mar 2006, Oliver Fromme wrote:

 I'm still catching up with your mail from Thursday-Friday: this one
 and the one with the main patch.  I tested and debugged the
 patch and found a few problems and many more complications...

 > Bruce Evans wrote:
 > > Oliver Fromme wrote:
 > > > So you mean in the SBS_CANTSENDMORE case, POLLHUP should be
 > > > set without checking if the caller has requested POLLOUT in
 > > > the events mask?  That sounds reasonable, because POLLOUT
 > > > certainly can't be returned in that case.  It makes the
 > > > code more complex, though.
 > >
 > > Yes.  POLLHUP is also needed for making poll() return for poll()
 > > waiting for input only.  I think it would make the code slightly
 > > less complex.
 >
 > You're right.  My patch made that part of the code slightly
 > less complex, indeed.

 It tests both SBS_CANTSENDMORE and SBS_CANTRCVMORE.  Testing both
 seems to be needed, but after my changes things got more complicated
 again.  For fifos there are 2 sockets each with these 2 flags, so
 there are 2**4 combinations of flags to consider.  When we set
 POLLHUP we are supposed to not set POLLOUT, but even when we force
 this in sopoll() we have to worry about fifo_poll() ORing POLLHUP
 for the read socket together with POLLOUT for the write socket.
 Anyway, userland is not ready for POLLHUP, so I think we shouldn't
 add it to sopoll() yet.

 > > I'm interested in what non-Linux non-FreeBSD systems do.
 >
 > DEC UNIX 4.0D doesn't return POLLHUP at all, only POLLIN.
 > ...
 > Solaris 9 seems to behave exactly the same as Linux in the
 > ...
 >
 > NetBSD 3.0 is very interesting, so I give the detailed
 > output from the test program (which I modified to produce
 > regression test compliant output, see my other mail):

 I've only looked at NetBSD-2.0.1 sources.  These seem to still have
 some of the bugs in 4.4BSD that I fixed.  NetBSD-3.0 seems to be better.

 > 1..26
 > ok 1      Pipe state 4: expected 0; got 0
 > ok 2      Pipe state 5: expected POLLIN; got POLLIN
 > ok 3      Pipe state 6: expected POLLIN | POLLHUP; got POLLIN | POLLHUP
 > not ok 4  Pipe state 6a: expected POLLHUP; got POLLIN | POLLHUP

 I think we'll need to go back to this (always return POLLIN with POLLHUP).
 I found that lat_rpc in lmbench2 is broken without this.  At least in my
 old version of libc, libc/rpc uses poll() a lot, and it doesn't understand
 POLLHUP.  E.g., at EOF read_vc() spins forever waiting for POLLIN unless
 POLLIN is set together with POLLHUP.
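
 The read-side hazard described above can be avoided in userland by
 treating POLLHUP like POLLIN and letting read() report EOF.  The
 following is a minimal sketch of that pattern, not the libc/rpc code;
 the names wait_and_read() and demo_eof_on_closed_pipe() are
 hypothetical, and EINTR handling is omitted.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Wait until fd is readable or hung up, then read.  Returns the read()
 * result: > 0 for data, 0 for EOF, -1 on error (including poll failure).
 */
static ssize_t
wait_and_read(int fd, void *buf, size_t len)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, -1) == -1)
        return (-1);
    /* POLLHUP and POLLERR may be reported even though only POLLIN was requested. */
    if (pfd.revents & (POLLIN | POLLHUP | POLLERR))
        return (read(fd, buf, len));
    return (-1);    /* e.g. POLLNVAL */
}

/* Hypothetical demo: a pipe whose writer has closed should yield EOF, not a hang. */
static ssize_t
demo_eof_on_closed_pipe(void)
{
    int fds[2];
    char buf[16];
    ssize_t n;

    if (pipe(fds) == -1)
        return (-1);
    close(fds[1]);    /* no data and no writer left */
    n = wait_and_read(fds[0], buf, sizeof(buf));
    close(fds[0]);
    return (n);
}
```

 A caller written this way should not care whether the kernel reports
 POLLIN, POLLHUP, or both at EOF.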

 > ok 5      Pipe state 4: expected 0; got 0
 > ok 6      Pipe state 5: expected POLLIN; got POLLIN
 > ok 7      Pipe state 6: expected POLLIN | POLLHUP; got POLLIN | POLLHUP
 > not ok 8  Pipe state 6a: expected POLLHUP; got POLLIN | POLLHUP

 Same.

 > ok 9      FIFO state 0: expected 0; got 0
 > ok 10     FIFO state 1: expected 0; got 0
 > ok 11     FIFO state 2: expected POLLIN; got POLLIN
 > ok 12     FIFO state 2a: expected 0; got 0
 > not ok 13 FIFO state 3: expected POLLHUP; got POLLIN

 Similarly.  I changed your patches to return both POLLHUP and POLLIN here.
 (This required complications to zap POLLIN as well as POLLHUP in state 0.)
 I thought that returning POLLHUP would be harmless, but it isn't for
 output since returning POLLHUP requires not returning POLLOUT so
 programs that don't understand POLLHUP might spin at EOF for write by
 waiting for POLLOUT.

 > ok 14     FIFO state 4: expected 0; got 0
 > ok 15     FIFO state 5: expected POLLIN; got POLLIN
 > not ok 16 FIFO state 6: expected POLLIN | POLLHUP; got POLLIN

 Similarly.  For this state, we could fix the bug in gdb (premature exit
 on POLLHUP when POLLIN is also set and actually indicates non-null data)
 by returning only POLLIN.  This would only work for polling for readability.
 For writability, POLLHUP needs to be returned synchronously if at all, to
 give the application a chance of avoiding a write that would fail.
 select()'s interface, and returning POLLOUT on EOF, presumably results in
 lots of processes killed by SIGPIPE when they try such a write.
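
 On the write side, an application that does understand POLLHUP can
 check for it before writing.  The sketch below is an illustration
 under stated assumptions, not code from the patch: checked_write()
 and demo_write_after_reader_left() are hypothetical names, and
 SIGPIPE is ignored as a backstop for systems that report only
 POLLOUT in this state.

```c
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <unistd.h>

/*
 * Write only if poll() does not report the peer as gone.
 * Returns bytes written, or -1 with errno set (EPIPE if the reader left).
 */
static ssize_t
checked_write(int fd, const void *buf, size_t len)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    if (poll(&pfd, 1, -1) == -1)
        return (-1);
    if (pfd.revents & POLLNVAL) {
        errno = EBADF;
        return (-1);
    }
    if (pfd.revents & (POLLHUP | POLLERR)) {
        errno = EPIPE;    /* skip the write that would fail */
        return (-1);
    }
    return (write(fd, buf, len));
}

/* Hypothetical demo: returns 1 if the write was refused with EPIPE. */
static int
demo_write_after_reader_left(void)
{
    int fds[2], saved;
    ssize_t n;

    /* Process-wide: turn a failing write into EPIPE instead of a signal. */
    signal(SIGPIPE, SIG_IGN);
    if (pipe(fds) == -1)
        return (-1);
    close(fds[0]);    /* the reader goes away */
    n = checked_write(fds[1], "x", 1);
    saved = errno;
    close(fds[1]);
    return (n == -1 && saved == EPIPE);
}
```

 Either path ends in EPIPE: the poll() check catches kernels that
 report POLLHUP or POLLERR, and the ignored SIGPIPE covers kernels
 that report only POLLOUT.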

 > not ok 17 FIFO state 6a: expected POLLHUP; got POLLIN

 Same as for pipes.

 [... same for second iteration]

 > That means two things:
 > 1.  When POLLHUP is returned, POLLIN is also always
 >    returned.
 > 2.  For FIFOs, POLLHUP is not used at all, but POLLIN
 >    is used instead.  This is the behaviour that Stevens
 >    describes in APUE, by the way.
 >
 > I guess portable programs cannot rely on the results from
 > poll() too much ...  They probably just look if at least
 > one of POLLHUP and POLLIN is set, and then call read().
 > Otherwise they would break on one platform or another.

 Not supporting POLLHUP for pipes and fifos seems best.  We have
 to set POLLIN on EOF since too many programs only look at POLLIN.
 Then setting POLLHUP doesn't gain much.  It's strange to support
 POLLHUP for pipes but not for fifos.  It is easier to support for
 pipes but more useful for fifos.

 > Here's a web page from someone who did similar tests on
 > a wide range of operating systems:
 >
 > http://www.greenend.org.uk/rjk/2001/06/poll.html
 >
 > His conclusions are a little bit different.  *SIGH*
 > It's all the fault of fuzzy SUS/POSIX.  :-(

 Urk.  It shows about 50 variations in 12 OS's without even checking
 fifos.

 We need more regression tests for sockets if we're going to change
 sopoll() significantly.  I hacked the tests to check socketpair()
 (just change pipe() to socketpair(...)).  Pipes were once just
 socketpairs but are now handled specially, and this gives more
 variations.  Fortunately not many.  Before your changes, there are
 no differences for select(), and for poll() there are these:

 before:
 < ok 3      Pipe state 6: expected POLLIN | POLLHUP; got POLLIN | POLLHUP
 < not ok 4  Pipe state 6a: expected POLLHUP; got POLLIN | POLLHUP
 after:
 > not ok 3  Socketpair state 6: expected POLLIN | POLLHUP; got POLLIN
 > not ok 4  Socketpair state 6a: expected POLLHUP; got POLLIN

 We just lose all setting of POLLHUP, and this only makes a difference
 here.  (State 6a is the only problem case for pipes and socketpair()
 has this and a problem with state 6 too.)
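
 The pipe()-to-socketpair() substitution can be sketched as a single
 probe that runs the same check on either kind of descriptor pair.
 This is not the regression test itself; revents_after_close() is a
 hypothetical helper, and AF_UNIX/SOCK_STREAM are assumed for the
 elided socketpair() arguments.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a connected pair with either pipe() or socketpair(), close the
 * writing side without sending anything, and return the revents that a
 * zero-timeout poll() for POLLIN reports on the reading side.
 * Returns -1 if setup or poll() fails.
 */
static int
revents_after_close(int use_socketpair)
{
    struct pollfd pfd;
    int fds[2], r;

    if (use_socketpair)
        r = socketpair(AF_UNIX, SOCK_STREAM, 0, fds);    /* assumed arguments */
    else
        r = pipe(fds);
    if (r == -1)
        return (-1);
    close(fds[1]);    /* disconnection with no data pending */
    pfd.fd = fds[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    r = poll(&pfd, 1, 0);
    close(fds[0]);
    return (r == -1 ? -1 : pfd.revents);
}
```

 Comparing revents_after_close(0) with revents_after_close(1) on a
 given system shows whether it sets POLLIN, POLLHUP, or both for each
 kind of pair.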

 After your changes there are no differences for pipes and socketpairs.

 With my version of your changes there is a difference for state 6a again:

 before:
 < not ok 4  Pipe state 6a: expected POLLHUP; got POLLIN | POLLHUP
 after:
 > ok 4      Socketpair state 6a: expected POLLHUP; got POLLHUP

 My changes are supposed to always set POLLIN with POLLHUP (giving "not ok"
 in state 6a), and they somehow do that in sopoll() for fifos but not for
 socketpairs.

 Linux-2.6.10 has the following problem cases:

 select():
 % not ok 9  FIFO state 0: expected set; got clear

 Linux apparently doesn't have a special case for state 0 in fifos
 (reader with no data, no writer and no disconnection) -- it has the
 same behaviour in this state for select() as for poll() although this
 behaviour is clearly nonstandard for select().
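
 State 0 can be probed from userland roughly as follows.  This is a
 sketch rather than the test program: fifo_state0_select() and the
 /tmp path are hypothetical, and the result differs between systems
 (the test above expects "set", Linux-2.6.10 gave "clear").

```c
#include <sys/select.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Open a fresh FIFO for reading with no writer and ask select() whether it
 * is readable.  Returns 1 if set, 0 if clear, -1 on setup failure.
 */
static int
fifo_state0_select(void)
{
    char dir[] = "/tmp/fifotest.XXXXXX";    /* hypothetical location */
    char path[64];
    struct timeval tv = { 0, 0 };           /* do not block */
    fd_set rfds;
    int fd, r = -1;

    if (mkdtemp(dir) == NULL)
        return (-1);
    snprintf(path, sizeof(path), "%s/fifo", dir);
    if (mkfifo(path, 0600) == 0) {
        /* O_NONBLOCK lets the open succeed with no writer present. */
        fd = open(path, O_RDONLY | O_NONBLOCK);
        if (fd != -1) {
            FD_ZERO(&rfds);
            FD_SET(fd, &rfds);
            if (select(fd + 1, &rfds, NULL, NULL, &tv) != -1)
                r = FD_ISSET(fd, &rfds) ? 1 : 0;
            close(fd);
        }
        unlink(path);
    }
    rmdir(dir);
    return (r);
}
```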

 poll():
 % not ok 4  Socketpair state 6a: expected POLLHUP; got POLLIN | POLLHUP

 In this state (reader with no data and a disconnection), Linux has
 simpler behaviour that is inconsistent with Linux's pipe().

 I don't know socket programming well enough to quickly write similar
 tests for general connections.

 Bruce