svn commit: r196460 - head/sys/kern

Tue Aug 25 21:08:17 UTC 2009

On Tue, Aug 25, 2009 at 04:07:11AM +1000, Bruce Evans wrote:
> On Sun, 23 Aug 2009, Jilles Tjoelker wrote:

> > I think poll on fifos should instead be fixed by closing the
> > half-connection corresponding to writing from fi_readsock to
> > fi_writesock. I have tried this out, see attached patch. With the patch,
> > pipepoll only gives "expected POLLHUP; got POLLIN | POLLHUP" errors and
> > an error in fifo case 6b caused by our distinction between new and old
> > readers.

> This sort of worked for me, but has several problems:

> % Index: sys/fs/fifofs/fifo_vnops.c
> % ===================================================================
> % --- sys/fs/fifofs/fifo_vnops.c	(revision 196459)
> % +++ sys/fs/fifofs/fifo_vnops.c	(working copy)
> % @@ -193,6 +193,10 @@
> %  			goto fail2;
> %  		fip->fi_writesock = wso;
> %  		error = soconnect2(wso, rso);
> % +		if (error == 0)
> % +			error = soshutdown(rso, SHUT_WR);
> % +		if (error == 0)
> % +			error = soshutdown(wso, SHUT_RD);
> %  		if (error) {
> %  			(void)soclose(wso);
> %  fail2:

> The second soshutdown() is only harmful.  I see the following state changes
> - the first soshutdown() sets SBS_CANTRCVMORE on rso like you would expect,
>    and also sets SBS_CANTSENDMORE on wso.  This gives the desired state.
> - the second soshutdown() then clears SBS_CANTRCVMORE on rso (without
>    the first soshutdown() it leaves both flags clear in both directions).
>    This clobbers the desired state.  The failure shows in just one of my
>    uncommitted regression tests (when there is a writer and there was
>    a reader, poll() returns POLLOUT for the writer, but should return
>    POLLHUP; the missing SBS_CANTRCVMORE on rso prevents it ever returning
>    POLLHUP for writers).

> After removing the second soshutdown() and fixing a spurious POLLIN (see
> below), all my tests pass.

I have removed the second shutdown, it is not necessary.

> Elsewhere, fifo_vnops.c hacks on SBS_CANT*MORE directly.  Perhaps it should
> call soshutdown(), or if the direct access there is safe then it is probably
> safe above.

That's for a later commit to fix.

> % Index: sys/kern/uipc_socket.c
> % ===================================================================
> % --- sys/kern/uipc_socket.c	(revision 196469)
> % +++ sys/kern/uipc_socket.c	(working copy)
> % @@ -2898,11 +2898,13 @@
> %  		if (so->so_oobmark || (so->so_rcv.sb_state & SBS_RCVATMARK))
> %  			revents |= events & (POLLPRI | POLLRDBAND);
> % 
> % -	if ((events & POLLINIGNEOF) == 0)
> % -		if (so->so_rcv.sb_state & SBS_CANTRCVMORE)
> % -			revents |= POLLHUP;
> % -	if (so->so_snd.sb_state & SBS_CANTSENDMORE)
> % -		revents |= POLLHUP;
> % +	if ((events & POLLINIGNEOF) == 0) {

> Old problems become larger:

> I don't like POLLINIGNEOF (for input) affecting POLLHUP for output.  This
> seems to cause no problems for fifos, at least when the kernel sets
> POLLINIGNEOF, but it is hard to understand even why it doesn't cause
> problems, and kib@ wants POLLINIGNEOF to remain user-settable, so the
> complications might remain exported to userland for for fifos and
> sockets, where they are harder to document and understand.

I do not like userland POLLINIGNEOF either. I think programs can do fine
with the standard functionality (closing and reopening a fifo to reset
the POLLHUP state).

> % +			revents |= events & (POLLIN | POLLRDNORM);

> This gives spurious POLLINs when POLLHUP is also returned, and thus defeats
> the point of soreadable_data() being different from soreadable().  Tests
> 6a-6d show the spurious POLLIN.  I don't understand how tests 6a and 6c-6d
> passed for you.

Same problem here. I think kib@ wants to keep this in 8.x for the sake
of buggy programs that do not check for POLLHUP. I suppose we can do it
properly in 9.x.

> % +			if (so->so_snd.sb_state & SBS_CANTSENDMORE)
> % +				revents |= POLLHUP;

> Tests 6a-6d pass with the above 3 lines changed to:

>  			if (so->so_snd.sb_state & SBS_CANTSENDMORE)
>  				revents |= POLLHUP;
>  			else
>  				revents |= events & (POLLIN | POLLRDNORM);

> Returning POLLIN will cause poll() to not block on input descriptors, but
> this seems to be as correct as possible since there really is a read-EOF.
> Applications just can't see this EOF as POLLHUP -- they will see POLLIN
> and have to try to read(), and then interpret read() returning 0 as meaning
> EOF.  Device-independent applications must do precisely this anyway since
> the input descriptor might be a regular file and there is no POLLHUP for
> regular files.

Yes. Even more, any program must handle it because it is also possible
that an EOF happens between poll() and read().

-- 
Jilles Tjoelker