kern/170203: [kern] piped dd's don't behave sanely when dealing with a fifo

Bruce Evans brde at optusnet.com.au
Fri Jul 27 02:10:04 UTC 2012


The following reply was made to PR kern/170203; it has been noted by GNATS.

From: Bruce Evans <brde at optusnet.com.au>
To: Garrett Cooper <yanegomi at gmail.com>
Cc: freebsd-gnats-submit at FreeBSD.org, freebsd-bugs at FreeBSD.org
Subject: Re: kern/170203: [kern] piped dd's don't behave sanely when dealing
 with a fifo
Date: Fri, 27 Jul 2012 12:07:07 +1000 (EST)

 On Thu, 26 Jul 2012, Garrett Cooper wrote:
 
 >> Description:
 > Creating a fifo and then dd'ing across the fifo using /dev/zero doesn't seem to yield the behavior one would expect to have; dd should either exit thanks to SIGPIPE being sent or the count being completed.
 >
 > Furthermore, the count is bogus:
 >
 > Terminal 1:
 >
 > $ dd if=fifo bs=512k count=4
 > 0+4 records in
 > 0+4 records out
 > 32768 bytes transferred in 0.002121 secs (15449523 bytes/sec)
 > $ dd if=fifo bs=512k count=4
 > 0+4 records in
 > 0+4 records out
 > 32768 bytes transferred in 0.001483 secs (22096295 bytes/sec)
 > ...
 
 I think it's working almost as expected.  Large blocks give non-atomic
 I/O, so the reader sees small blocks, then EOF when it gets ahead of
 the writer.  This always happens without SMP.
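 
 A minimal sketch of why the reader reports 0+4 records, written in
 Python rather than dd's C.  It uses an anonymous pipe inside one
 process instead of a fifo between two (an assumption made to keep the
 result deterministic), and the 8K write size is chosen only to mirror
 the i/o unit seen with the old fifo implementation:

```python
import os

BS = 512 * 1024          # block size requested per read(), as in bs=512k
COUNT = 4                # as in count=4

r, w = os.pipe()
full = partial = total = 0
for _ in range(COUNT):
    # The writer has only managed to deliver 8K so far.
    os.write(w, b"\0" * 8192)
    # read() returns whatever is available, not the full 512K requested.
    n = len(os.read(r, BS))
    total += n
    if n == BS:
        full += 1
    else:
        partial += 1
os.close(r)
os.close(w)

print(f"{full}+{partial} records in, {total} bytes")
```

 This should report 0+4 records and 32768 bytes, like the quoted output:
 count limits the number of read() calls, not the number of bytes.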
 
 The missing SIGPIPE is a bug (debugged below).  There is no SIGPIPE at
 the start of write() because there is a reader then, and no SIGPIPE for
 the next write() because there is no next write() -- the current one
 doesn't notice when the reader goes away.
 
 This is what happens under FreeBSD-~5.2 with the old fifo implementation,
 at least.  It also shows a bug in truss(1) -- the current write() is not
 shown, because it hasn't returned.  kdump shows that the write() has
 started but not returned.
 
 > $ dd if=fifo bs=512M count=4
 > 0+4 records in
 > 0+4 records out
 > 32768 bytes transferred in 0.003908 secs (8384514 bytes/sec)
 >
 > Terminal 2:
 >
 > $ dd if=/dev/zero bs=512k count=4 of=fifo
 > ^T
 > load: 0.40  cmd: dd 1779 [sbwait] 2.63r 0.00u 0.00s 0% 1800k
 
 FreeBSD-~5.2 shows [runnable] for the wait channel.  This is
 strange.  dd should be blocked waiting for a reader, and only
 sbwait makes sense for that.  FreeBSD-9 apparently doesn't
 have the new named pipe implementation either.  -current shows
 [pipdwt].  This makes it clearer that it is waiting in write()
 and not in open().  dd probably does the wrong thing for
 fifos, by always trying to open files in O_RDWR mode first.
 This breaks the normal synchronization of readers and writers.
 In fact, this explains why there is no SIGPIPE -- there is
 always a reader since dd can always talk to itself.  First
 the open succeeds without blocking as expected.
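 
 The effect of the open mode can be sketched in a single process.  This
 is Python, not dd's code, and the helper name try_write is made up for
 illustration.  It assumes a system where opening a fifo with O_RDWR is
 permitted (POSIX leaves that undefined, but FreeBSD and Linux allow
 it), and it ignores SIGPIPE so that a failed write shows up as EPIPE:

```python
import errno
import os
import signal
import tempfile

# Make a write to a reader-less fifo fail with EPIPE instead of killing us.
signal.signal(signal.SIGPIPE, signal.SIG_IGN)


def try_write(fd, data):
    """Return the byte count written, or the errno name on failure."""
    try:
        return os.write(fd, data)
    except OSError as e:
        return errno.errorcode[e.errno]


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "fifo")
    os.mkfifo(path)

    # Case 1: O_RDWR.  The open returns at once, and the descriptor counts
    # as its own reader, so the write succeeds with nobody else on the fifo.
    rdwr = os.open(path, os.O_RDWR)
    rdwr_result = try_write(rdwr, b"x" * 100)
    os.close(rdwr)

    # Case 2: O_WRONLY.  A reader must exist for the open to complete; once
    # that reader goes away, the next write fails with EPIPE.
    reader = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    writer = os.open(path, os.O_WRONLY)
    os.close(reader)
    wronly_result = try_write(writer, b"x" * 100)
    os.close(writer)

print(rdwr_result, wronly_result)
```

 The first result is a byte count and the second is EPIPE, which is the
 difference between a writer that can talk to itself and one that cannot.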
 
 After changing the O_RDWR to O_WRONLY in FreeBSD-~5.2, dd almost
 works as expected.  The reader reads 4 blocks of size 8K and
 then exits.  The writer first blocks in open.  Then it is
 killed by SIGPIPE.  Its SIGPIPE handling is broken (nonexistent),
 and the signal kills it without it printing a status message:
 
 %   1266 dd       RET   read 524288/0x80000
 %   1266 dd       CALL  write(0x4,0x8063000,0x80000)
 %   1266 dd       RET   write -1 errno 32 Broken pipe
 %   1266 dd       PSIG  SIGPIPE SIG_DFL
 
 The read is from /dev/zero.  The write is of 512K to the fifo.
 This delivers 4*8K then is killed.  If dd caught the signal
 like it should, then we would expect to see a short
 write().  The signal handling should clear SA_RESTART, else
 the write() would be restarted and would deliver endless
 SIGPIPEs, now for failing writes.  Reporting of short writes
 is quite broken and this is an interesting test for it.
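 
 A sketch of a writer that survives the broken pipe and still reports a
 status.  It is Python, not a patch for dd, and write_blocks is a
 hypothetical helper.  As a simpler stand-in for catching SIGPIPE
 without SA_RESTART, it ignores the signal and relies on write()
 failing with EPIPE:

```python
import errno
import os
import signal
import tempfile

# Stand-in for a real handler: with SIGPIPE ignored, write() reports EPIPE.
signal.signal(signal.SIGPIPE, signal.SIG_IGN)


def write_blocks(fd, block, count):
    """Write up to `count` blocks; return (full, partial, error_name)."""
    full = partial = 0
    for _ in range(count):
        try:
            n = os.write(fd, block)
        except OSError as e:
            # Stop on the first error, but keep the counts for the status.
            return full, partial, errno.errorcode[e.errno]
        if n == len(block):
            full += 1
        else:
            partial += 1
    return full, partial, None


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "fifo")
    os.mkfifo(path)
    reader = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    writer = os.open(path, os.O_WRONLY)

    block = b"\0" * 1024
    before = write_blocks(writer, block, 1)   # reader still present
    os.close(reader)                          # reader goes away
    after = write_blocks(writer, block, 3)    # first write now fails

    os.close(writer)

print("before:", before)
print("after: ", after)
```

 The point is only that the counts are still available to print after
 the error, instead of the process dying silently.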
 
 -current delivers 4*64K instead of 4*8K.  This is because
 the i/o unit is BIG_PIPE_SIZE = 64K for nameless pipes and
 now for named pipes.  Apparently the unit is 8K for
 sockets.  I think the unit of atomicity is only 512 bytes
 for both.  Certainly, PIPE_BUF is still 512 in limits.h.
 I think limits.h is broken since the unit isn't actually
 512 bytes for _all_ file types.  For sockets, you can control
 the watermarks and I think this changes the unit of atomicity.
 I wonder if the socket ioctls for this worked with the old named pipe
 implementation.
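 
 The PIPE_BUF value discussed above can also be queried at run time
 instead of being read from limits.h.  A small sketch in Python, using
 pathconf on a freshly made fifo and fpathconf on an anonymous pipe;
 the values printed depend on the system (POSIX only guarantees at
 least 512):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "fifo")
    os.mkfifo(path)
    # Atomic-write limit reported for the named pipe.
    fifo_pipe_buf = os.pathconf(path, "PC_PIPE_BUF")

r, w = os.pipe()
# Atomic-write limit reported for an anonymous pipe.
pipe_pipe_buf = os.fpathconf(w, "PC_PIPE_BUF")
os.close(r)
os.close(w)

print("fifo PIPE_BUF:", fifo_pipe_buf)
print("pipe PIPE_BUF:", pipe_pipe_buf)
```

 This only reports the advertised limit.  It says nothing about the
 larger i/o unit that a single read() may actually return.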
 
 The pipe wait channel names are less than perfect.  "pipdw"
 means "pipe direct write".  "wt" looks like an abbreviation
 for "write", but there are 3 waits in pipe_direct_write()
 and they are distinguished by the suffixes "w", "c" and "t".
 It isn't clear what these mean.
 
 >> How-To-Repeat:
 > mkfifo fifo
 >
 > Terminal 1:
 >
 > dd if=fifo bs=512k count=4
 >
 > Terminal 2:
 >
 > dd if=/dev/zero bs=512k count=4 of=fifo
 
 Remember to kill the writing dd if you stop it with ^Z.  Otherwise, since
 the unhacked version is talking to itself, the fifo acts strangely for
 other tests.
 
 conv=block and conv=noerror (with cbs=512k) change the behaviour only
 slightly (slightly worse).  What works easily is omitting the count.
 dd then reads until EOF, in 256 records of size exactly 8K each under
 FreeBSD-~5.2.  Not giving the count is normal practice, since you
 rarely know the block size for pipes and many other file types.  If
 there is another bug here, then it is conv=foo not working.  But
 reblocking is confusing, and I probably did it wrong.
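 
 Reading until EOF without a count can be sketched as follows.  This is
 Python with a hypothetical helper read_until_eof, run against an
 anonymous pipe rather than a fifo, and the sizes are arbitrary:

```python
import os


def read_until_eof(fd, bs):
    """Read blocks of up to `bs` bytes until EOF; return (full, partial, total)."""
    full = partial = total = 0
    while True:
        n = len(os.read(fd, bs))
        if n == 0:          # EOF: every writer has closed its end
            break
        total += n
        if n == bs:
            full += 1
        else:
            partial += 1
    return full, partial, total


r, w = os.pipe()
os.write(w, b"\0" * 2500)
os.close(w)                 # lets the reader see EOF after the data
full, partial, total = read_until_eof(r, 1024)
os.close(r)

print(f"{full}+{partial} records in, {total} bytes")
```

 However the reads happen to be split into records, the byte total comes
 out right, which is why omitting the count is the easy case.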
 
 Another thing that doesn't work well here is trying to control the
 writer with SIGPIPE from the reader.  Even if you can get the reblocking
 right and read precisely 2MB, and fix SIGPIPE, then the SIGPIPE may be
 delivered after the writer has dirtied the fifo with a little more than
 2MB.  The unread data then remains to bite the next reader.
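 
 The leftover-data problem can be reproduced on a small scale.  In this
 Python sketch an O_RDWR descriptor stands in for a writer that keeps
 the fifo open (an assumption of the sketch, as are the tiny sizes),
 and two successive readers open the same fifo:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "fifo")
    os.mkfifo(path)

    # Stand-in for a writer that stays attached to the fifo.
    keeper = os.open(path, os.O_RDWR)
    # The writer delivers 4 bytes more than the first reader wants.
    os.write(keeper, b"A" * 8 + b"B" * 4)

    first = os.open(path, os.O_RDONLY)
    got_first = os.read(first, 8)      # takes exactly what it asked for
    os.close(first)

    second = os.open(path, os.O_RDONLY)
    got_second = os.read(second, 8)    # gets the stale remainder
    os.close(second)

    os.close(keeper)

print(got_first, got_second)
```

 The second reader receives the four stale bytes left behind by the
 first exchange, not data meant for it.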
 
 Bruce

