Pipes, cat buffer size

Sat Oct 18 23:12:04 UTC 2008

In the last episode (Oct 19), Ivan Voras said:
> Dan Nelson wrote:
> > In the last episode (Oct 18), Ivan Voras said:
> >> I'm working on a program that's intended to be used as a "filter",
> >> as in "something | myprogram > file". I'm trying it with cat and
> >> I'm seeing my read()s return small blocks, 64 kB in size. I
> >> suppose this is because cat writes in 64 kB blocks. So:
> >>
> >> a) Is there a way to programmatically, per-process, set the pipe buffer
> >> size? The program in question is a compressor and it's particularly
> >> inefficient when given small blocks and I'm wondering if the system can
> >> buffer enough data for it.
> > 
> > Why not keep reading until you reach your desired compression block
> > size?  Bzip2's default blocksize is 900k, for example.
> 
> Of course. But that's not the point :) From what I see (didn't look at
> the code), Linux for example does some kind of internal buffering that
> decouples how the reader and the writer interact. I think that with
> FreeBSD's current behaviour the writer could write 1-byte buffers and
> the reader will be forced to read each byte individually. I don't know
> if there's some ulterior reason for this.

No; take a look at /sys/kern/sys_pipe.c .  Depending on how much data
is in the pipe, it switches between async in-kernel buffering (<8192
bytes), and direct page wiring between sender and receiver (basically
zero-copy).
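
As for reading until you have a full compression block, here is a
minimal sketch in C of what I mean. The helper name read_full() is made
up for illustration (it is not a libc function), and the caller is
assumed to pick its own block size. It simply loops on read() until the
buffer is full or the writer closes the pipe, so it does not matter how
small the writer's chunks are.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling read() until `len` bytes have been
 * collected in `buf`, or until end-of-file.
 *
 * Returns the number of bytes actually stored (less than `len` only at
 * EOF, and 0 if EOF was hit immediately), or -1 on a read error.
 */
static ssize_t
read_full(int fd, void *buf, size_t len)
{
	size_t got = 0;

	while (got < len) {
		ssize_t n = read(fd, (char *)buf + got, len - got);

		if (n == 0)		/* writer closed its end: EOF */
			break;
		if (n < 0) {
			if (errno == EINTR)	/* interrupted, retry */
				continue;
			return (-1);
		}
		got += (size_t)n;
	}
	return ((ssize_t)got);
}
```

A filter would call read_full(STDIN_FILENO, block, blocksize) in a loop
and hand each returned block to the compressor, stopping when it
returns 0.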

> >> b) Is there any objection to the following patch to cat:
> > 
> > It might be simpler to just use "dd if=myfile obs=1m" instead of
> > patching cat.
> 
> I believe patching cat to bring its block size into the century of the
> fruitbat has its own benefits.

-- 
	Dan Nelson
	dnelson at allantgroup.com