Pipes, cat buffer size

Sun Oct 19 00:50:24 UTC 2008

In the last episode (Oct 19), Ivan Voras said:
> 2008/10/19 Dan Nelson <dnelson at allantgroup.com>:
> > In the last episode (Oct 19), Ivan Voras said:
> >> Of course. But that's not the point :) From what I see (didn't
> >> look at the code), Linux for example does some kind of internal
> >> buffering that decouples how the reader and the writer interact. I
> >> think that with FreeBSD's current behaviour the writer could write
> >> 1-byte buffers and the reader will be forced to read each byte
> >> individually. I don't know if there's some ulterior reason for
> >> this.
> >
> > No; take a look at /sys/kern/sys_pipe.c .  Depending on how much
> > data is in the pipe, it switches between async in-kernel buffering
> > (<8192 bytes), and direct page wiring between sender and receiver
> > (basically zero-copy).
> 
> Ok, maybe it's just not behaving as I thought it should. See this
> test program:

[ program that prints the amount of data in each read() ]

> and this command line:
> 
> > dd bs=1 if=/dev/zero| ./reader
> 
> The output of this on RELENG_7 is:
> 
> read 8764 bytes
> read 1 bytes
[..]
> read 1 bytes
> read 1 bytes
> ...
> 
> The first value puzzles me - so it actually is doing some kind of
> buffering. Linux isn't actually much better, but the intention is
> there:
> 
> $ dd if=/dev/zero bs=1 | ./bla
> read 1 bytes
> read 38 bytes
> read 8 bytes
> read 2 bytes
[..]
> read 2 bytes
> read 3 bytes
> read 3 bytes
> read 112 bytes
> read 2 bytes
> read 2 bytes
> ...
> 
> Maybe FreeBSD switches between the writer and the reader too soon so
> the buffer doesn't get filled?

If your reader isn't doing any real work between reads, it is always
reading, so the pipe will never fill up.  The delay in FreeBSD was
probably due to the shell spawning the writer first, so it buffered up
8k of data before the reader was ready.  After that, the reader was
able to pull data as fast as the writer pushed.

> Using cat (which started all this), FreeBSD consistently processes
> 4096 byte buffers, while Linux's sizes are all over the place - from
> 4 kB to 1 MB, randomly fluctuating. My goal would be (if it's
> possible - it might not be) to maximize coalescing in an environment
> where the reader does something with the data (e.g. compression) so
> there should be a reasonable amount of backlogged input data.

Remember that increasing coalescing also increases latency and
decreases the parallelism between reader and writer: if you coalesce,
you make the reader wait for data that's already been written, in the
hope that the writer will write again soon.

> But if it works in general, it may simply be that it isn't really
> applicable to my purpose (and I should modify the reader to read
> multiple blocks).

That's my suggestion, yes.  That way your program would also work when
passed data from an internet socket (where you will get varying read()
sizes too).  It wouldn't add more than 10 lines to wrap your read in a
loop that exits when your preferred size has been reached.
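
Something along these lines would do it.  This is only a rough sketch,
not tested against your reader: read_full() is a name I made up, and
"want" stands for whatever block size your compressor prefers.  It keeps
calling read() until the buffer holds the requested amount, the writer
closes the pipe, or an error occurs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling read() until `want` bytes have been
 * collected, EOF is reached, or an error occurs.
 *
 * Returns the number of bytes stored in buf (less than `want` only at
 * EOF), or -1 on a read error other than EINTR.
 */
static ssize_t
read_full(int fd, void *buf, size_t want)
{
	char *p = buf;
	size_t got = 0;

	assert(buf != NULL);

	while (got < want) {
		ssize_t n = read(fd, p + got, want - got);

		if (n == 0)
			break;			/* EOF: writer closed its end */
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, just retry */
			return -1;		/* real error, errno is set */
		}
		got += (size_t)n;
	}
	return (ssize_t)got;
}
```

A short return value then only happens at end of input, so the caller
can treat "less than want" as the final block.  The same loop works
unchanged on a socket descriptor.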

> Though it won't help me, I still think that modifying cat is worth it :)

-- 
	Dan Nelson
	dnelson at allantgroup.com