read vs. mmap (or io vs. page faults)

Dan Nelson dnelson at
Sun Jun 20 15:41:52 GMT 2004

In the last episode (Jun 20), Mikhail Teterin said:
> I expected the second way to be faster, as it is supposed to avoid
> one memory copy (no user-space buffer). But in reality, on a
> CPU-bound (rather than IO-bound) machine, using mmap() is
> considerably slower. Here are tcsh's time results:
> 	Single Pentium2-400MHz running 4.8-stable:
> 	------------------------------------------
> stdio: 56.837u 34.115s 2:06.61 71.8%   66+193k 11253+0io 3pf+0w
> mmap:  72.463u  7.534s 2:34.62 51.7%   5+186k  105+0io   22328pf+0w
> 	Dual Pentium2 Xeon 450MHz running recent -current:
> 	--------------------------------------------------
> stdio: 36.557u 29.395s 3:09.88 34.7%   10+165k 32646+0io 0pf+0w
> mmap:  42.052u  7.545s 2:02.25 40.5%   10+169k 16+0io    15232pf+0w
> On the IO-bound machine, using mmap is only marginally faster:
> 	Single Pentium4M (Centrino 1GHz) running recent -current:
> 	---------------------------------------------------------
> stdio: 27.195u 8.280s 1:33.02 38.1%    10+169k 11221+0io 1pf+0w
> mmap:  26.619u 3.004s 1:23.59 35.4%    10+169k 47+0io    19463pf+0w
> Notice the last two columns in time's output -- why is page-faulting a
> page in -- on-demand -- so much slower than read()-ing it? I even tried
> inserting ``madvise(buffer, file_size, MADV_SEQUENTIAL)'' between the
> mmap() and the process() -- it made no difference at all (or made the
> mmap() take slightly longer)...

MADV_SEQUENTIAL just lets the system expire already-read blocks from
its cache faster, so it won't help much here.  read() should cause some
prefetching to occur, but it obviously doesn't work all the time, or
else the block-input count (the first number in time's "io" column)
wouldn't have been as high as 11000.  For sequential access I would
have expected read() to prefetch almost every block before the
userland process needed it.
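
For reference, the two approaches being compared can be sketched
roughly as follows.  This is only a minimal illustration, not the
original poster's program: process() here is a hypothetical stand-in
that just sums bytes, and the 64 KB read buffer and the function names
sum_with_read() and sum_with_mmap() are assumptions made for the
example.

```c
#define _DEFAULT_SOURCE /* glibc only: exposes madvise(); harmless elsewhere */
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the poster's process(): sums the bytes. */
static uint64_t process(const unsigned char *p, size_t n, uint64_t acc)
{
    for (size_t i = 0; i < n; i++)
        acc += p[i];
    return acc;
}

/* read() path: the kernel copies each chunk into a user-space buffer. */
int sum_with_read(const char *path, uint64_t *out)
{
    unsigned char buf[65536];   /* buffer size is an arbitrary choice */
    uint64_t acc = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        acc = process(buf, (size_t)n, acc);
    close(fd);
    if (n < 0)
        return -1;
    *out = acc;
    return 0;
}

/* mmap() path: no copy; each page is faulted in on first touch. */
int sum_with_mmap(const char *path, uint64_t *out)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {      /* mmap() rejects a zero-length mapping */
        close(fd);
        *out = 0;
        return 0;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;
    /* Advisory only; the return value is deliberately ignored. */
    madvise(p, (size_t)st.st_size, MADV_SEQUENTIAL);
    *out = process(p, (size_t)st.st_size, 0);
    munmap(p, (size_t)st.st_size);
    return 0;
}
```

Running each function on the same large, uncached file under time
would be expected to show the split seen in the numbers above: block
input operations charged to the read() version and page faults charged
to the mmap() version.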

	Dan Nelson
	dnelson at

More information about the freebsd-current mailing list