kern/170203: [kern] piped dd's don't behave sanely when dealing
with a fifo
David Xu
listlog2011 at gmail.com
Fri Jul 27 09:00:23 UTC 2012
The following reply was made to PR kern/170203; it has been noted by GNATS.
From: David Xu <listlog2011 at gmail.com>
To: Bruce Evans <brde at optusnet.com.au>
Cc: Garrett Cooper <yanegomi at gmail.com>, freebsd-bugs at FreeBSD.org,
freebsd-gnats-submit at FreeBSD.org
Subject: Re: kern/170203: [kern] piped dd's don't behave sanely when dealing
with a fifo
Date: Fri, 27 Jul 2012 16:52:22 +0800
On 2012/7/27 10:07, Bruce Evans wrote:
> On Thu, 26 Jul 2012, Garrett Cooper wrote:
>
>>> Description:
>> Creating a fifo and then dd'ing across the fifo using /dev/zero
>> doesn't seem to yield the behavior one would expect to have; dd
>> should either exit thanks to SIGPIPE being sent or the count being
>> completed.
>>
>> Furthermore, the count is bogus:
>>
>> Terminal 1:
>>
>> $ dd if=fifo bs=512k count=4
>> 0+4 records in
>> 0+4 records out
>> 32768 bytes transferred in 0.002121 secs (15449523 bytes/sec)
>> $ dd if=fifo bs=512k count=4
>> 0+4 records in
>> 0+4 records out
>> 32768 bytes transferred in 0.001483 secs (22096295 bytes/sec)
>> ...
>
> I think it's working almost as expected. Large blocks give non-atomic
> I/O, so the reader sees small blocks, then EOF when it gets ahead of
> the writer. This always happens without SMP.
>
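The "0+4 records" line follows from how the reader counts. A minimal sketch, assuming a dd-like reader that issues one read() per record and never retries a short read; the names count_records and struct rec_count are made up for illustration and are not taken from dd's source:

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical tally, modelled on dd's "full+partial records in" line. */
struct rec_count {
	size_t full;	/* reads that returned exactly bs bytes */
	size_t partial;	/* reads that returned fewer than bs bytes */
	size_t bytes;	/* total bytes read */
};

/*
 * Issue up to `count` read() calls of `bs` bytes each.  Every read() that
 * returns data counts as one record, however short it is.  Stops early at
 * EOF.  Returns 0 on success or EOF, -1 on error.
 */
int
count_records(int fd, size_t bs, size_t count, struct rec_count *rc)
{
	char *buf = malloc(bs);

	if (buf == NULL)
		return (-1);
	rc->full = rc->partial = rc->bytes = 0;
	for (size_t i = 0; i < count; i++) {
		ssize_t n = read(fd, buf, bs);

		if (n < 0) {
			free(buf);
			return (-1);
		}
		if (n == 0)		/* EOF: the writer is gone */
			break;
		if ((size_t)n == bs)
			rc->full++;
		else
			rc->partial++;
		rc->bytes += (size_t)n;
	}
	free(buf);
	return (0);
}
```

With bs=512k and a pipe that hands over 8K at a time, four reads give 0 full and 4 partial records, 32768 bytes in total, which matches the output in the report.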
> The missing SIGPIPE is a bug (debugged below). There is no SIGPIPE at
> the start of write() because there is a reader then, and no SIGPIPE for
> the next
> write() because there is no next write() -- the current one doesn't
> notice when the reader goes away.
>
After fixing dd so that it no longer opens the fifo output file in O_RDWR
mode, I still found the writer blocked even though the reader had already
exited. I think this is definitely a bug: if the reader has exited, the
writer should be aborted too, but I found it still blocked in state
"pipdwt". The code in /sys/fs/fifo_vnops.c clearly wants to wake up the
writer when the reader closes the fifo, but it fails because the writer
forgets to set the PIPE_WANTW flag. The close path therefore skips
wakeup(), and the writer never gets a chance to see that the EOF flag is
set.
I had to apply the following two patches to make the bug go away:
http://people.freebsd.org/~davidxu/patch/fifopipe/kernel_pipe.diff
http://people.freebsd.org/~davidxu/patch/fifopipe/dd.diff
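The behaviour I expect from the writer can be checked from userland. This is only a rough sketch, not part of either patch: the function name write_after_reader_exits is made up, and it assumes the fifo already exists. A child reads once and exits, and the parent keeps writing with SIGPIPE ignored until write() fails. On a kernel with the missed wakeup the parent would presumably hang in write() instead of returning.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a reader that does a single small read() from the fifo and exits,
 * then write `len` bytes from the parent.  Returns the errno of the
 * write() that failed (EPIPE is expected), 0 if all `len` bytes were
 * accepted, or -1 on setup failure.  *written gets the bytes accepted.
 */
int
write_after_reader_exits(const char *fifo_path, size_t len, size_t *written)
{
	static char buf[65536];		/* zero-filled payload */
	void (*old)(int);
	pid_t pid;
	int fd, err = 0;

	*written = 0;
	pid = fork();
	if (pid < 0)
		return (-1);
	if (pid == 0) {
		/* Reader: one read, then exit without draining the fifo. */
		char tmp[4096];
		int rfd = open(fifo_path, O_RDONLY);

		if (rfd < 0 || read(rfd, tmp, sizeof(tmp)) < 0)
			_exit(1);
		_exit(0);
	}

	old = signal(SIGPIPE, SIG_IGN);	/* get EPIPE instead of being killed */
	fd = open(fifo_path, O_WRONLY);	/* blocks until the reader opens */
	if (fd < 0) {
		kill(pid, SIGKILL);	/* do not leave the child stuck in open() */
		err = -1;
	} else {
		while (*written < len) {
			size_t left = len - *written;
			size_t chunk = left < sizeof(buf) ? left : sizeof(buf);
			ssize_t n = write(fd, buf, chunk);

			if (n < 0) {
				err = errno;
				break;
			}
			*written += (size_t)n;
		}
		close(fd);
	}
	signal(SIGPIPE, old);
	waitpid(pid, NULL, 0);
	return (err);
}
```

Calling it with len = 512 * 1024 on a fresh fifo should return EPIPE with *written well below len.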
> This is what happens under FreeBSD-~5.2 with the old fifo implementation,
> at least. It also shows a bug in truss(1) -- the current write() is not
> shown, because it hasn't returned. kdump shows that the write() has
> started but not returned.
>
>> $ dd if=fifo bs=512M count=4
>> 0+4 records in
>> 0+4 records out
>> 32768 bytes transferred in 0.003908 secs (8384514 bytes/sec)
>>
>> Terminal 2:
>>
>> $ dd if=/dev/zero bs=512k count=4 of=fifo
>> ^T
>> load: 0.40 cmd: dd 1779 [sbwait] 2.63r 0.00u 0.00s 0% 1800k
>
> FreeBSD-~5.2 shows [runnable] for the wait channel. This is
> strange. dd should be blocked waiting for a reader, and only
> sbwait makes sense for that. FreeBSD-9 apparently doesn't
> have the new named pipe implementation either. -current shows
> [pipdwt]. This makes it clearer that it is waiting in write()
> and not in open(). dd probably does the wrong thing for
> fifos, by always trying to open files in O_RDWR mode first.
> This breaks the normal synchronization of readers and writers.
> In fact, this explains why there is no SIGPIPE -- there is
> always a reader since dd can always talk to itself. First
> the open succeeds without blocking as expected.
>
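The "talking to itself" effect is easy to reproduce. A small sketch, assuming the fifo already exists and nothing else has it open; write_to_self is a made-up name. POSIX leaves O_RDWR on a fifo undefined, so this relies on the system allowing it, as described above.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Open a fifo read-write and write one byte while no other process has it
 * open.  The open does not block and the write succeeds, because the
 * descriptor counts as its own reader.  Returns the write() result, or -1
 * if open() fails.
 */
ssize_t
write_to_self(const char *fifo_path)
{
	int fd = open(fifo_path, O_RDWR);
	ssize_t n;

	if (fd < 0)
		return (-1);
	n = write(fd, "x", 1);	/* no SIGPIPE: we are our own reader */
	close(fd);
	return (n);
}
```

With O_WRONLY in place of O_RDWR the open() would instead block until some reader shows up.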
> After changing the O_RDWR to O_WRONLY in FreeBSD-~5.2, dd almost
> works as expected. The reader reads 4 blocks of size 8K and
> then exits. The writer first blocks in open. Then it is
> killed by SIGPIPE. Its SIGPIPE handling is broken (nonexistent),
> and the signal kills it without it printing a status message:
>
> % 1266 dd RET read 524288/0x80000
> % 1266 dd CALL write(0x4,0x8063000,0x80000)
> % 1266 dd RET write -1 errno 32 Broken pipe
> % 1266 dd PSIG SIGPIPE SIG_DFL
>
> The read is from /dev/zero. The write is of 512K to the fifo.
> This delivers 4*8K then is killed. If dd caught the signal
> like it should, then we would expect to see a short
> write(). The signal handling should clear SA_RESTART, else
> the write() would be restarted and would deliver endless
> SIGPIPEs, now for failing writes. Reporting of short writes
> is quite broken and this is an interesting test for it.
>
> -current delivers 4*64K instead of 4*8K. This is because
> the i/o unit is BIG_PIPE_SIZE = 64K for nameless pipes and
> now for named pipes. Apparently the unit is 8K for
> sockets. I think the unit of atomicity is only 512 bytes
> for both. Certainly, PIPE_BUF is still 512 in limits.h.
> I think limits.h is broken since the unit isn't actually
> 512 bytes for _all_ file types. For sockets, you can control
> the watermarks and I think this changes the unit of atomicity.
> I wonder if the socket ioctls for this worked with the old named pipe
> implementation.
>
> The pipe wait channel names are less than perfect. "pipdw"
> means "pipe direct write". "wt" looks like an abbreviation
> for "write", but there are 3 waits in pipe_direct_write()
> and they are distinguished by the suffixes "w", "c" and "t".
> It isn't clear what these mean.
>
>>> How-To-Repeat:
>> mkfifo fifo
>>
>> Terminal 1:
>>
>> dd if=fifo bs=512k count=4
>>
>> Terminal 2:
>>
>> dd if=/dev/zero bs=512k count=4 of=fifo
>
> Remember to kill the writing dd if you stop it with ^Z. Otherwise, since
> the unhacked version is talking to itself, the fifo acts strangely for
> other tests.
>
> conv=block and conv=noerror (with cbs=512k) change the behaviour only
> slightly (slightly worse). What works easily is omitting the count.
> dd then reads until EOF, in 256 records of size exactly 8K each under
> FreeBSD-~5.2. Not giving the count is normal practice, since you
> rarely know the block size for pipes and many other file types. If
> there is another bug here, then it is conv=foo not working. But
> reblocking is confusing, and I probably did it wrong.
>
> Another thing that doesn't work well here is trying to control the
> writer with SIGPIPE from the reader. Even if you can get the reblocking
> right and read precisely 2MB, and fix SIGPIPE, then the SIGPIPE may be
> delivered after the writer has dirtied the fifo with a little more than
> 2MB. The unread data then remains to bite the next reader.
>
> Bruce