Reading via mmap stinks (Re: weird bugs with mmap-ing via NFS)
Peter Jeremy
peterjeremy at optushome.com.au
Sat Mar 25 10:39:46 UTC 2006
On Fri, 2006-Mar-24 15:18:00 -0500, Mikhail Teterin wrote:
>which there is not with the read. Read also requires fairly large buffers in
>the user space to be efficient -- *in addition* to the buffers in the kernel.
I disagree. With a filesystem read, the kernel is solely responsible
for handling physical I/O with an efficient buffer size. The userland
buffers simply amortise the cost of the system call and copyout
overheads.
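To illustrate, here is a rough sketch of the usual read(2) loop. The
function name sum_bytes_read and its interface are made up for this
example; it is not taken from grep or from your program. The size of
the userland buffer only changes how many system calls and copyouts
are made, not how the kernel schedules the physical I/O.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Sum every byte of `path`, reading through a userland buffer of
 * `bufsize` bytes.  Returns 0 and stores the sum in *out, or -1 on error.
 * Hypothetical helper for illustration only.
 */
int
sum_bytes_read(const char *path, size_t bufsize, uint64_t *out)
{
    assert(bufsize > 0 && out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    unsigned char *buf = malloc(bufsize);
    if (buf == NULL) {
        close(fd);
        return -1;
    }

    uint64_t sum = 0;
    ssize_t n;
    /* Outer loop: one system call and one copyout per buffer. */
    while ((n = read(fd, buf, bufsize)) > 0) {
        /* Inner loop: process whatever the kernel handed back. */
        for (ssize_t i = 0; i < n; i++)
            sum += buf[i];
    }

    free(buf);
    close(fd);
    if (n < 0)
        return -1;      /* I/O errors show up as a plain return value */
    *out = sum;
    return 0;
}
```

Calling it with a 1-byte buffer and with a 64k buffer gives the same
answer; only the number of trips into the kernel differs.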
>I'm also quite certain, that fulfilling my "demands" would add quite a bit of
>complexity to the mmap support in kernel, but hey, that's what the kernel is
>there for :-)
Unfortunately, your patches to implement this seem to have become detached
from your e-mail. :-)
>Unlike grep, which seems to use only 32k buffers anyway (and does not use
>madvise -- see attachment), my program mmaps gigabytes of the input file at
>once, trusting the kernel to do a better job at reading the data in the most
>efficient manner :-)
mmap can lend itself to cleaner implementations because there's no
need to have a nested loop to read buffers and then process them. You
can mmap the entire file and process it. One downside is that on a
32-bit architecture, this limits you to processing files that are
somewhat less than 2GB. Another downside is that touching an uncached
page triggers a trap which may not be as efficient as reading a block
of data through the filesystem interface, and I/O errors are delivered
via signals (which may not be as easy to handle).
>Peter Jeremy wrote:
>> On an amd64 system running about 6-week old -stable, both ['grep' and 'grep
>> --mmap' -mi] behave pretty much identically.
>
>Peter, I read grep's source -- it is not using madvise (because it hurts
>performance on SunOS-4.1!) and reads in chunks of 32k anyway. Would you care
>to look at my program instead? Thanks:
>
> http://aldan.algebra.com/mzip.c
fetch: http://aldan.algebra.com/mzip.c: Not Found
I tried writing a program that just mmap'd my entire (2GB) test file
and summed all the longwords in it. This gave me similar results to
grep. Setting MADV_SEQUENTIAL and/or MADV_WILLNEED made no noticeable
difference. I suspect something about your code or system is disabling
the mmap read-ahead functionality.
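For reference, a test of that kind can be sketched roughly as below.
This is a reconstruction for illustration, not the exact program I
ran; the name sum_longwords_mmap and the choice to pass the advice
value as a parameter are assumptions. Any trailing bytes that do not
fill a whole unsigned long are ignored.

```c
#define _DEFAULT_SOURCE 1   /* glibc only: exposes madvise(); harmless elsewhere */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map all of `path`, apply `advice` (e.g. MADV_NORMAL, MADV_SEQUENTIAL
 * or MADV_WILLNEED) and sum the file as an array of unsigned long.
 * Returns 0 and stores the sum in *out, or -1 on error.
 * Hypothetical helper for illustration only.
 */
int
sum_longwords_mmap(const char *path, int advice, unsigned long *out)
{
    assert(out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0 ||
        (uintmax_t)st.st_size > SIZE_MAX) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;
    void *base = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (base == MAP_FAILED)
        return -1;

    /* The advice is only a hint, so a failure here is not fatal. */
    (void)madvise(base, len, advice);

    /* mmap returns a page-aligned address, so word access is aligned. */
    const unsigned long *w = base;
    size_t nwords = len / sizeof *w;
    unsigned long sum = 0;
    for (size_t i = 0; i < nwords; i++)
        sum += w[i];

    munmap(base, len);
    *out = sum;
    return 0;
}
```

Timing this with each advice value in turn (on a file that is not
already cached) is a quick way to see whether madvise makes any
difference on a given system.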
What happens if you simulate read-ahead yourself? Have your main
program fork and the child access pages slightly ahead of the parent
but do nothing else.
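A minimal sketch of that experiment follows. The function name
sum_with_prefetch_child is made up, and the sketch is simplified: the
child is not throttled to stay only "slightly" ahead, it just touches
one byte per page from start to end and relies on doing far less work
per page than the parent. Keeping it within a fixed window would need
some shared progress counter, which I have left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Sum the `len` bytes at `p` (normally an existing read-only file
 * mapping) while a forked child faults the pages in ahead of us.
 * Returns 0 and stores the sum in *out, or -1 on error.
 * Hypothetical helper for illustration only.
 */
int
sum_with_prefetch_child(const unsigned char *p, size_t len, uint64_t *out)
{
    assert(p != NULL && out != NULL);

    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: read one byte per page and do nothing else. */
        const volatile unsigned char *vp = p;
        for (size_t off = 0; off < len; off += (size_t)pagesize) {
            unsigned char c = vp[off];  /* volatile read forces the access */
            (void)c;
        }
        _exit(0);
    }

    /* Parent: the real work, ideally hitting pages already in core. */
    uint64_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += p[i];

    int status;
    if (waitpid(pid, &status, 0) < 0)   /* reap the child */
        return -1;

    *out = sum;
    return 0;
}
```

If this is noticeably faster than the plain mmap loop on an uncached
file, that points at read-ahead not happening for your mapping.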
--
Peter Jeremy