pipe/fifo code merged.

Mon Jan 9 14:34:31 UTC 2012

On Sun, 8 Jan 2012, Giovanni Trematerra wrote:

> Hi,
> the patch at
> http://www.trematerra.net/patches/pipefifo_merge2.diff
>
> is a preliminary version of the FIFO optimizations project that I picked up from
> the wiki.
> http://wiki.freebsd.org/IdeasPage#FIFO_optimizations_.28GSoC.29

I would go the other way, and pessimize pipes to be like fifos.  Then
optimize the socket layer under both.  Fifos are not important, but
they are implemented on top of the socket layer which is important.
Pipes are important.  In 4.4BSD, pipes were implemented on top of the
socket layer too.  This was much simpler than for fifos -- pipe() was
just a wrapper that took a whole 44 lines, while fifofs took 602 lines.
Now, fifofs still only takes 753 lines, but sys_pipe.c takes 1671
lines.  pipe() is similar to socketpair(), but even simpler.  socketpair()
took 62 lines in 4.4BSD.  It still takes only 81 lines (the extras are
mainly for splitting it into sys_socketpair() and kern_socketpair()).
The pipe optimizations in FreeBSD originated in 1996.  They are good
locally, but may have inhibited more useful optimizations in the socket
layer.
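
As a rough userland illustration of how thin a socket-based pipe() can
be, the sketch below builds a one-way channel from socketpair() plus
two shutdown() calls.  It is not the 4.4BSD kernel code; the name
socket_pipe() is hypothetical, and the choice of AF_UNIX/SOCK_STREAM is
an assumption made for the example.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical userland analogue of a pipe() layered on sockets.
 * On success, fd[0] is the read end and fd[1] is the write end,
 * matching the pipe() convention.  Returns 0 on success, -1 on error.
 */
int
socket_pipe(int fd[2])
{
	int sv[2];

	/* Assumption: a connected pair of local stream sockets. */
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);

	/* Make the pair one-way: sv[0] only reads, sv[1] only writes. */
	if (shutdown(sv[0], SHUT_WR) == -1 ||
	    shutdown(sv[1], SHUT_RD) == -1) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}

	fd[0] = sv[0];
	fd[1] = sv[1];
	return (0);
}
```

A caller would use it like pipe(): write to fd[1], read from fd[0], and
see end-of-file on fd[0] once fd[1] is closed.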

For the socket layer, there is the ZERO_COPY_SOCKETS option.  This
gives optimizations related to the ones for pipes.  I have no experience
with it.  It seems to be only for hardware sockets.  It is apparently
not very popular or well maintained, since it isn't in any GENERIC.

The socket layer provides some fancy ioctls that might be useful and
even work for anything implemented on top of sockets.  The ones for
controlling socket buffer sizes and watermarks are most interesting.
I don't know if the fifo wrapper does anything to prevent passing these
to the socket layer.  For pipes, there are no fancy ioctls.  The pipe
code uses heuristics and the hard-coded value PIPE_MINDIRECT to
decide whether it should try to optimize for small writes or large
writes.  These mostly work, but don't provide as much control as the
socket ioctls.  I once did a lot of benchmarking of FreeBSD pipe i/o
vs Linux pipe i/o.  Linux is much faster for small blocks and FreeBSD
is much faster for large blocks provided they are not so large as to
bust caches.  This is because although the FreeBSD options for direct
writes work, they have large overheads, and FreeBSD has much larger
overheads generally.  If the application could control the mode, then
the overheads could be reduced by switching to completely different
code (and if you want socket ioctls, even to the socket code).  But
this would be very complicated.
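
For reference, the kind of per-descriptor control that sockets offer
and pipes lack looks roughly like the sketch below.  From userland the
buffer sizes and watermarks are reached through setsockopt(2) with
SO_SNDBUF, SO_RCVBUF, SO_SNDLOWAT and SO_RCVLOWAT.  The helper names
set_sndbuf() and set_rcvlowat() are hypothetical, and whether a given
option has any effect on a particular socket type is system-dependent.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: request a send buffer of `want` bytes on socket
 * s and return the size the kernel reports afterwards, or -1 on error.
 * The reported size may differ from the request (some systems round it
 * or double it for bookkeeping).
 */
int
set_sndbuf(int s, int want)
{
	int got;
	socklen_t len = sizeof(got);

	if (setsockopt(s, SOL_SOCKET, SO_SNDBUF, &want, sizeof(want)) == -1)
		return (-1);
	if (getsockopt(s, SOL_SOCKET, SO_SNDBUF, &got, &len) == -1)
		return (-1);
	return (got);
}

/*
 * Hypothetical helper: set the receive low-water mark, i.e. the
 * minimum number of bytes that must be queued before a blocking read
 * or a select()/poll() reports the socket readable.
 * Returns 0 on success, -1 on error.
 */
int
set_rcvlowat(int s, int bytes)
{
	return (setsockopt(s, SOL_SOCKET, SO_RCVLOWAT, &bytes,
	    sizeof(bytes)));
}
```

An application that knows it will do large writes could raise the send
buffer up front; one that wants low latency for small writes could
leave it alone.  Nothing comparable is exposed for pipes, where the
choice is left to the kernel heuristics.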

Linux-2.6.10 implements fifos as a small wrapper around pipes, while
FreeBSD implements them as a large wrapper around sockets.  I hope the
former is what you do -- share most pipe code, without making it more
complicated, and while making the fifo wrapper much simpler.  The Linux
code is much simpler and smaller, since for pipes it doesn't
implement direct mode, and for sockets it doesn't have to interact with
the complicated socket layer.

Bruce