read vs. mmap (or io vs. page faults)

Matthew Dillon dillon at apollo.backplane.com
Sun Jun 20 17:13:03 PDT 2004


    Hmm.  Well, you can try calling madvise(... MADV_WILLNEED), that's what
    it is for.  
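
    A minimal sketch of that hint, assuming a read-only mapping of a
    whole file.  The helper name map_willneed() is made up for
    illustration; madvise() is only advisory, so the kernel is free to
    ignore it and the sketch does not treat a failure as fatal.

```c
#define _DEFAULT_SOURCE 1   /* exposes madvise() on glibc; harmless elsewhere */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `path` read-only and tell the kernel the
 * whole range will be needed soon.  Returns MAP_FAILED on error (or
 * for an empty file); on success *len_out holds the mapping length.
 */
static void *map_willneed(const char *path, size_t *len_out)
{
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return MAP_FAILED;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return MAP_FAILED;
    }

    size_t len = (size_t)st.st_size;
    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping survives the close */
    if (p == MAP_FAILED)
        return MAP_FAILED;

    /* Advisory only: ask for read-ahead, ignore the result. */
    (void)madvise(p, len, MADV_WILLNEED);

    *len_out = len;
    return p;
}
```

    The caller is expected to munmap() the region when done.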

    It is usually a bad idea to try to populate the page table with all
    resident pages associated with a memory mapping, because mmap()
    is often used to map huge files... hundreds of megabytes or even
    dozens of gigabytes (on 64-bit architectures).  The last thing you want
    to do is to populate the page table for the entire file.  It might
    work for your particular program, but it is a bad idea for the OS to
    assume that for every mmap().

    What it comes down to, really, is whether you feel you actually need the
    additional performance, because it kinda sounds to me that whatever 
    processing you are doing to the data is either going to be I/O bound,
    or it isn't going to run long enough for the additional overhead to matter
    versus the processing overhead of the program itself.

    If you are really worried you could pre-fault the mmap before you do
    any processing at all and measure the time it takes to pre-fault the
    pages vs the time it takes to process the memory image.  (You pre-fault
    simply by accessing one byte of data in each page across the mmap(),
    before you begin any processing).
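
    The pre-fault pass described above could look roughly like the
    following.  This is a sketch, not a tuned implementation: the names
    prefault_pages() and now_seconds() are made up, and the 4096-byte
    fallback page size is an assumption used only if sysconf() fails.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() under strict modes */
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/*
 * Touch one byte in every page of [base, base + len) so that each
 * page is faulted in before the real processing starts.  Returns the
 * number of pages touched.
 */
static size_t prefault_pages(const void *base, size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t page = ps > 0 ? (size_t)ps : 4096;   /* assumed fallback */
    const volatile unsigned char *p = base;     /* volatile: keep the reads */
    size_t touched = 0;

    for (size_t off = 0; off < len; off += page) {
        (void)p[off];
        touched++;
    }
    return touched;
}

/* Monotonic wall-clock time in seconds, for timing the two phases. */
static double now_seconds(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Usage, with `map` and `len` coming from an earlier mmap():
 *
 *     double t0 = now_seconds();
 *     prefault_pages(map, len);
 *     double t1 = now_seconds();
 *     ... process the memory image ...
 *     double t2 = now_seconds();
 *
 * Comparing (t1 - t0) with (t2 - t1) shows how much the faults cost
 * relative to the processing itself.
 */
```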

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>

:=     It's hard to say.  mmap() could certainly be made more efficient, e.g.
:=     by faulting in more pages at a time to reduce the actual fault rate.
:=     But it's fairly difficult to beat a read copy into a small buffer.
:
:Well, that's the thing -- by mmap-ing the whole file at once (and by
:madvise-ing with MADV_SEQUENTIAL), I thought I told the kernel
:everything it needed to know to make the best decision. Why can't the
:page-faulting code do a better job using all this knowledge than the
:poor read, which only knows about the partial read in question?
:
:I find it so disappointing that it can probably be considered a bug.
:I'll try this code on Linux and Solaris. If mmap is better there (as it
:really ought to be), we have a problem, IMHO. Thanks!
:
:	-mi
:

