pipe/fifo code merged.

Thu Jan 12 12:52:04 UTC 2012

On Thu, 12 Jan 2012, Giovanni Trematerra wrote:

> thanks again to point me out those regression tests I missed in first place.
> I ran those tests and results were identical with patched and non
> patched kernel.
> So at least in that regard the patch doesn't introduce more regressions.
> There are some tests that fail. If you and others think it's worth to
> fix them I can
> take this on.

Matching the old behaviour is good enough for now.

> ...
> STOCK KERNEL 10.0-CURRENT
>
> [gianni at devbox: poll]% ./pipepoll
> 1..20
> ok 1      Pipe state 4: expected 0; got 0
> ok 2      Pipe state 5: expected POLLIN; got POLLIN
> ok 3      Pipe state 6: expected POLLIN | POLLHUP; got POLLIN | POLLHUP
> not ok 4  Pipe state 6a: expected POLLHUP; got POLLIN | POLLHUP
> ok 5      Sock state 4: expected 0; got 0
> ok 6      Sock state 5: expected POLLIN; got POLLIN
> ok 7      Sock state 6: expected POLLIN | POLLHUP; got POLLIN | POLLHUP
> not ok 8  Sock state 6a: expected POLLHUP; got POLLIN | POLLHUP
> ok 9      FIFO state 0: expected 0; got 0
> ok 10     FIFO state 1: expected 0; got 0
> ok 11     FIFO state 2: expected POLLIN; got POLLIN
> ok 12     FIFO state 2a: expected 0; got 0
> not ok 13 FIFO state 3: expected POLLHUP; got POLLIN | POLLHUP
> ok 14     FIFO state 4: expected 0; got 0
> ok 15     FIFO state 5: expected POLLIN; got POLLIN
> ok 16     FIFO state 6: expected POLLIN | POLLHUP; got POLLIN | POLLHUP
> not ok 17 FIFO state 6a: expected POLLHUP; got POLLIN | POLLHUP

Returning spurious POLLIN with POLLHUP (when there is no input available)
is not too serious.  It means that applications can't trust POLLIN alone.
They must also check POLLHUP (which they should check anyway) and then if
both are set they are reduced to read()ing the file to see if there is
any input, much like they have to do if they use select() instead of poll()
(since select() cannot distiginguish these conditions).  Applications that
have been converted from poll() to select() should still do the right
thing by still using read().  However, gdb was broken by this conversion:

     echo 'p 0' | gdb /bin/cat

This prints "Hangup detected on fd 0\n", then "error detected on stdin",
then exits with status 0.  It never sees its input.  It should do the
same thing as for normal input of 'p 0\n<terminal EOF>' -- that is see
the input and execute it, then see the EOF and not complain about it
(twice), then exit with status 0.  This is because it thinks that
POLLHUP implies that there is no more input.  It doesn't even check
POLLIN when it sees POLLHUP.  If it checked, then it would have to
worry about spurious POLLIN, but here the problem is the opposite --
there is non-spurious POLLIN, and non-spurions POLLHUP.  POLLHUP must
be acted on immediately when the kernel detected it, to unblock poll()
or select(), even when there is unread input, since otherwise any unread
input would leave the poll blocked forever.

> not ok 18 FIFO state 6b: poll result 0 expected 1. expected POLLHUP; got 0

This one is more interesting and delicate.  Linux fails in the same
way as FreeBSD here.  6b through 6d test that if there is a hackup
condition, then new and old readers all see it....

> not ok 19 FIFO state 6c: expected POLLHUP; got POLLIN | POLLHUP
> not ok 20 FIFO state 6d: expected POLLHUP; got POLLIN | POLLHUP

Back to spurious POLLIN.

... 6b tests a new reader and fails because the new reader doesn't see
the hangup condition.  6c tests that the old reader still sees the
hangup condition after the open for the new reader has (possibly and
erroneously) cleared it.  This passes.  6d tests that the old reader
still sees the hangup condition after the new reader has gone away.
This passes too.

Note that it is only possibly to get a new reader by opening with
O_NONBLOCK.  Otherwise, the open for the new reader blocks so the
difference between never having had a "connection" and having a
hangup condition for a previous connection cannot be seen.  Some
users want the new open to not see the hangup so that poll on it
blocks waiting for a new connection.  I want it to see the hangup
condition so that the condition only depends on the "file" state
and not on the timing of the opens.  Note that the hangup condition
is cleared when the last reader that can see it goes away.  Thus
there is only a difference in unusual cases.  There are 2 main types
of unusual cases:
- when there are races between herds of readers and writers.  Now it
   seems best to not have the hangup condition depend so much on the
   timing.
- when some readers intentionally don't go away on hangup, but wait
   for a new connection.  Such waiting is difficult either way.  Sticky
   hangup prevents blocking in poll on the old fd's.  Now it seems best
   for new readers to not see the old hangups.  My preferred behviour
   prevents this when the old readers don't go away.  They might stay
   either because you want them to, or because you you can't control
   them and their owner wants them to.
But very complicated setups that intentionally go near the unusual
cases probably need external synchronization to avoid races and
access control to prevent uncontrolled accesses changing the fifo
state and eating the i/o.  It is not just hangup that involves device
state.

> [gianni at devbox: poll]% ./pipeselect
> 1..20
> ok 1      Pipe state 4: expected clear; got clear
> ok 2      Pipe state 5: expected set; got set
> ok 3      Pipe state 6: expected set; got set
> ok 4      Pipe state 6a: expected set; got set
> ok 5      Sock state 4: expected clear; got clear
> ok 6      Sock state 5: expected set; got set
> ok 7      Sock state 6: expected set; got set
> ok 8      Sock state 6a: expected set; got set
> not ok 9  FIFO state 0: expected set; got clear

Everything except #9 for select() gives the expected results in my
version.  State 0 is just the state after initial open in O_RDONLY
mode.  See the large comment about this in pipeselect.c (it says
that select() must see POLLIN although poll() must not).  Perhaps
the comment is wrong or out of date.

> ok 10     FIFO state 1: expected clear; got clear
> ok 11     FIFO state 2: expected set; got set
> ok 12     FIFO state 2a: expected clear; got clear
> ok 13     FIFO state 3: expected set; got set
> ok 14     FIFO state 4: expected clear; got clear
> ok 15     FIFO state 5: expected set; got set
> ok 16     FIFO state 6: expected set; got set
> ok 17     FIFO state 6a: expected set; got set
> not ok 18 FIFO state 6b: expected set; got clear
> ok 19     FIFO state 6c: expected set; got set
> ok 20     FIFO state 6d: expected set; got set

select() passes all except #9 and #18 because it can't distinguish the
spurious POLLIN from POLLHUP.

I added some tests, mainly for POLLOUT.  Unfortunately, this patch
won't apply cleanly, because -current has some changes that I haven't
merged.  I forget if the kernel needs any changes to pass these.  Not
many anyway.  Output tests and the changes in -current are missing for
pipeselect.c.

% Index: pipepoll.c
% ===================================================================
% RCS file: /home/ncvs/src/tools/regression/poll/pipepoll.c,v
% retrieving revision 1.1
% diff -u -2 -r1.1 pipepoll.c
% --- pipepoll.c	12 Jul 2009 12:50:43 -0000	1.1
% +++ pipepoll.c	25 Aug 2009 13:58:28 -0000
% @@ -30,4 +30,7 @@
%  		result = "POLLIN";
%  		break;
% +	case POLLOUT:
% +		result = "POLLOUT";
% +		break;
%  	case POLLHUP:
%  		result = "POLLHUP";
% @@ -36,4 +39,13 @@
%  		result = "POLLIN | POLLHUP";
%  		break;
% +	case POLLIN | POLLOUT:
% +		result = "POLLIN | POLLOUT";
% +		break;
% +	case POLLIN | POLLOUT | POLLHUP:
% +		result = "POLLIN | POLLOUT | POLLHUP";
% +		break;
% +	case POLLOUT | POLLHUP:
% +		result = "POLLOUT | POLLHUP";
% +		break;
%  	default:
%  		asprintf(&ncresult, "%#x", events);
% @@ -81,10 +93,10 @@
%  	}
%  	pfd.fd = fd;
% -	pfd.events = POLLIN;
% +	pfd.events = POLLIN | POLLOUT;
% 
%  	if (filetype == FT_FIFO) {
%  		if (poll(&pfd, 1, 0) < 0)
%  			err(1, "poll");
% -		report(num++, "0", 0, pfd.revents);
% +		report(num++, "0", POLLOUT, pfd.revents);
%  	}
%  	kill(ppid, SIGUSR1);

Note that this is buggy.  The fifo is still opened O_RDONLY, so in
-current poll() returns 0, though it should probably return POLLERR
(see previous mail).  IIRC, this test for POLLOUT passes under Linux.
This verifies that Linux doesn't check the open mode for output.

The tests could be expanded to check intentionally that silly
combinations of flags and open modes don't work.  Except for the
above, they apparently avoid the silly combinations, else the change
to check the open mode would have caused more failures.

% @@ -104,5 +116,5 @@
%  	if (poll(&pfd, 1, 0) < 0)
%  		err(1, "poll");
% -	report(num++, "1", 0, pfd.revents);
% +	report(num++, "1", POLLOUT, pfd.revents);
%  	kill(ppid, SIGUSR1);
% 
% @@ -112,10 +124,10 @@
%  	if (poll(&pfd, 1, 0) < 0)
%  		err(1, "poll");
% -	report(num++, "2", POLLIN, pfd.revents);
% +	report(num++, "2", POLLIN | POLLOUT, pfd.revents);
%  	if (read(fd, buf, sizeof buf) != 1)
%  		err(1, "read");
%  	if (poll(&pfd, 1, 0) < 0)
%  		err(1, "poll");
% -	report(num++, "2a", 0, pfd.revents);
% +	report(num++, "2a",  POLLOUT, pfd.revents);
%  	kill(ppid, SIGUSR1);
% 
% @@ -140,5 +152,5 @@
%  	if (poll(&pfd, 1, 0) < 0)
%  		err(1, "poll");
% -	report(num++, "4", 0, pfd.revents);
% +	report(num++, "4", POLLOUT, pfd.revents);
%  	kill(ppid, SIGUSR1);
% 
% @@ -148,5 +160,5 @@
%  	if (poll(&pfd, 1, 0) < 0)
%  		err(1, "poll");
% -	report(num++, "5", POLLIN, pfd.revents);
% +	report(num++, "5", POLLIN | POLLOUT, pfd.revents);
%  	kill(ppid, SIGUSR1);
% 
% @@ -193,7 +205,16 @@
%  		report(num++, "6d", POLLHUP, pfd.revents);
%  	}
% -	close(fd);
%  	kill(ppid, SIGUSR1);
% 
% +	if (filetype != FT_FIFO)
% +		close (fd);
% +	else {
% +		usleep(1);
% +		while (state != 7)
% +			;
% +		close(fd);
% +		kill(ppid, SIGUSR1);
% +	}
% +
%  	exit(0);
%  }
% @@ -202,4 +223,6 @@
%  parent(int fd)
%  {
% +	struct pollfd pfd;
% +
%  	usleep(1);
%  	while (state != 1)
% @@ -210,4 +233,10 @@
%  			err(1, "open for write");
%  	}
% +	pfd.fd = fd;
% +	pfd.events = POLLIN | POLLOUT;
% +
% +	if (poll(&pfd, 1, 0) < 0)
% +		err(1, "poll");
% +	report(-1, "1p", POLLOUT, pfd.revents);
%  	kill(cpid, SIGUSR1);
% 
% @@ -253,4 +282,19 @@
%  	while (state != 7)
%  		;
% +	fd = open(FIFONAME, O_WRONLY | O_NONBLOCK);
% +	if (fd < 0)
% +		err(1, "open for write");
% +	pfd.fd = fd;
% +	if (poll(&pfd, 1, 0) < 0)
% +		err(1, "poll");
% +	report(-1, "7", POLLOUT, pfd.revents);
% +	kill(cpid, SIGUSR1);
% +
% +	usleep(1);
% +	while (state != 8)
% +		;
% +	if (poll(&pfd, 1, 0) < 0)
% +		err(1, "poll");
% +	report(-1, "8", POLLHUP, pfd.revents);
%  }
%

Bruce