read vs. mmap (or io vs. page faults)
mi+kde at aldan.algebra.com
Wed Jun 23 06:42:18 GMT 2004
On Tuesday 22 June 2004 11:27 pm, Peter Wemm wrote:
= mmap is more valuable as a programmer convenience these days. Don't
= make the mistake of assuming it's faster, especially since the cost of
= a copy has gone way down.
Actually, let me back off from agreeing with you here :-) On I/O-bound
machines (such as my laptop), there is no discernible difference in
either the CPU or the elapsed time -- md5-ing a file with mmap or read
is (curiously) slightly faster than just cat-ing it into /dev/null.
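For anyone who wants to repeat the comparison, here is a minimal sketch of
the two access patterns being timed. It is not the md5 code used for the
numbers in this message: consume() is a trivial additive checksum standing
in for the MD5 update step, and the names sum_with_read() and
sum_with_mmap() and the 64Kb buffer size are my own assumptions for
illustration.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for the MD5 update step: a trivial additive checksum. */
static uint64_t consume(uint64_t sum, const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}

/* read(2) path: the kernel copies every chunk into buf.
 * Returns 0 on success, -1 on error. */
int sum_with_read(const char *path, uint64_t *out)
{
    unsigned char buf[65536];   /* buffer size is an arbitrary choice */
    uint64_t sum = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        sum = consume(sum, buf, (size_t)n);
    close(fd);
    if (n < 0)
        return -1;
    *out = sum;
    return 0;
}

/* mmap(2) path: no copy; pages arrive through page faults.
 * Maps the whole file at once, which can fail for files larger
 * than the available address space. Returns 0 on success, -1 on error. */
int sum_with_mmap(const char *path, uint64_t *out)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {      /* mmap rejects zero-length mappings */
        close(fd);
        *out = 0;
        return 0;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;
    *out = consume(0, p, (size_t)st.st_size);
    munmap(p, (size_t)st.st_size);
    return 0;
}
```

Wrapping each function in its own small program and running it under
time(1) gives figures of the same shape as the ones below: user and system
time, elapsed time, block input operations and page faults.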
On a dual P2 450MHz, the single process always wins the CPU time and
sometimes the elapsed time. Sometimes it wins handsomely:
mmap: 35.271u 4.004s 1:06.08 59.4% 10+190k 0+0io 4185pf+0w
read: 32.134u 15.797s 1:58.72 40.3% 408+302k 11228+0io 12pf+0w
mmap: 35.039u 4.558s 1:10.27 56.3% 10+190k 5+0io 5028pf+0w
read: 29.931u 27.848s 2:07.17 45.4% 10+187k 11219+0io 5pf+0w
Mind you, both processors are Xeons with _2MB of cache each_, so memory
copying should be even cheaper on them than usual. And yet mmap manages
to win...
On a single P2 400MHz (standard 512KB cache) mmap always wins the CPU
time, and, thanks to that, can win the elapsed time on a busy system.
For example, running two of these processes in parallel (on two separate
copies of the same huge file residing on distinct disks) yields (same
1462726660-byte file as in the dual Xeon stats above):
mmap: 66.989u 7.584s 3:01.76 41.0% 5+238k 90+0io 22456pf+0w
      65.474u 7.729s 2:38.59 46.1% 5+241k 90+0io 22401pf+0w
read: 60.724u 42.394s 3:37.01 47.5% 5+241k 22541+0io 0pf+0w
      61.778u 41.987s 3:35.36 48.1% 5+239k 11256+0io 0pf+0w
That's 182 vs. 215 seconds, or a 15% elapsed-time win for mmap. Evidently,
mmap runs through that "nasty nasty code" faster than read runs through
its own. mmap loses on an idle system, I presume, because page-faulting is
not smart enough to page-fault ahead as efficiently as read pre-reads.
Why am I complaining then? Because I want the "nasty nasty code"
improved so that using mmap is beneficial for the single process too.
Thank you very much! Yours,
More information about the freebsd-current