Reading via mmap stinks (Re: weird bugs with mmap-ing via NFS)

Sat Mar 25 10:39:46 UTC 2006

On Fri, 2006-Mar-24 15:18:00 -0500, Mikhail Teterin wrote:
>which there is not with the read. Read also requires fairly large buffers in 
>the user space to be efficient -- *in addition* to the buffers in the kernel. 

I disagree.  With a filesystem read, the kernel is solely responsible
for handling physical I/O with an efficient buffer size.  The userland
buffers simply amortise the cost of the system call and copyout
overheads.
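
To illustrate the point, here is a minimal sketch of a read(2)-based
loop.  The function name sum_bytes_read and the 64KB buffer size are
assumptions made for this example, not taken from any program in this
thread.  The userland buffer only decides how many system calls and
copyouts are needed; the kernel remains free to choose its own
physical I/O size.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: sum every byte of a file using read(2).
 * Returns 0 on success, -1 on error.
 */
int sum_bytes_read(const char *path, uint64_t *sum)
{
    unsigned char buf[64 * 1024];   /* userland buffer; size is an assumption */
    uint64_t total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    /* Outer loop: one system call (and one copyout) per buffer. */
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        /* Inner loop: process the bytes that were just copied out. */
        for (ssize_t i = 0; i < n; i++)
            total += buf[i];
    }
    close(fd);

    if (n < 0)
        return -1;
    *sum = total;
    return 0;
}
```

A larger buffer means fewer trips into the kernel, but it should not
change how the data is fetched from the disk.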

>I'm also quite certain, that fulfilling my "demands" would add quite a bit of 
>complexity to the mmap support in kernel, but hey, that's what the kernel is 
>there for :-)

Unfortunately, your patches to implement this seem to have become detached
from your e-mail. :-)

>Unlike grep, which seems to use only 32k buffers anyway (and does not use 
>madvise -- see attachment), my program mmaps gigabytes of the input file at 
>once, trusting the kernel to do a better job at reading the data in the most 
>efficient manner :-)

mmap can lend itself to a cleaner implementation because there's no
need to have a nested loop to read buffers and then process them.  You
can mmap the entire file and process it.  One downside is that on a
32-bit architecture, this limits you to processing files that are
somewhat less than 2GB.  Another downside is that touching an uncached
page triggers a trap, which may not be as efficient as reading a block
of data through the filesystem interface, and I/O errors are delivered
via signals (which may not be as easy to handle).

>Peter Jeremy wrote:
>> On an amd64 system running about 6-week old -stable, both ['grep' and 'grep 
>> --mmap' -mi] behave pretty much identically.
>
>Peter, I read grep's source -- it is not using madvise (because it hurts 
>performance on SunOS-4.1!) and reads in chunks of 32k anyway. Would you care 
>to look at my program instead? Thanks:
>
>	http://aldan.algebra.com/mzip.c

fetch: http://aldan.algebra.com/mzip.c: Not Found

I tried writing a program that just mmap'd my entire (2GB) test file
and summed all the longwords in it.  This gave me similar results to
grep.  Setting MADV_SEQUENTIAL and/or MADV_WILLNEED made no noticeable
difference.  I suspect something about your code or system is disabling
the mmap read-ahead functionality.

What happens if you simulate read-ahead yourself?  Have your main
program fork and the child access pages slightly ahead of the parent
but do nothing else.
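
A very rough sketch of that experiment follows.  The function name
sum_with_prefetch_child is made up for the example, and the child here
is not paced against the parent: it simply touches one byte in every
page from start to end, which is a simplification of "slightly ahead".
It relies on the mapping being inherited across fork(2).

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: sum the bytes of an existing mapping while a
 * forked child faults the pages in ahead of the parent.
 * Returns 0 on success, -1 on error.
 */
int sum_with_prefetch_child(const unsigned char *map, size_t len,
                            uint64_t *sum)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    uint64_t total = 0;
    int status;
    pid_t pid;

    if (pagesize <= 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: touch one byte per page and do nothing else. */
        volatile unsigned char sink = 0;
        for (size_t off = 0; off < len; off += (size_t)pagesize)
            sink = map[off];
        (void)sink;
        _exit(0);
    }

    /* Parent: the real work, hopefully hitting pages already cached. */
    for (size_t i = 0; i < len; i++)
        total += map[i];

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    *sum = total;
    return 0;
}
```

If this makes the parent noticeably faster, that would suggest the
kernel's own read-ahead is not kicking in for your mapping.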

-- 
Peter Jeremy