read vs. mmap (or io vs. page faults)

Sun Jun 20 00:43:29 PDT 2004

Hello!

I'm writing a message-digest utility, which operates on a file and
can use either stdio:

	while (not eof) {
		char buffer[BUFSIZE];
		size = read(.... buffer ...);
		process(buffer, size);
	}

or mmap:

	buffer = mmap(... file_size, PROT_READ ...);
	process(buffer, file_size);
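
For concreteness, here is a minimal C sketch of the two variants above. It
is only an illustration under stated assumptions: process() is a
hypothetical stand-in that sums bytes (the real digest is not shown here),
the BUFSIZE value is a guess, and the names digest_read() and digest_mmap()
are made up for this example.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define BUFSIZE 65536	/* assumed; the post does not give a value */

/* Hypothetical stand-in for the real digest: a running byte sum. */
void
process(uint64_t *sum, const unsigned char *p, size_t n)
{
	for (size_t i = 0; i < n; i++)
		*sum += p[i];
}

/* read(2) variant: the kernel copies each chunk into a user-space buffer. */
int
digest_read(const char *path, uint64_t *sum)
{
	static unsigned char buffer[BUFSIZE];
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	*sum = 0;
	while ((n = read(fd, buffer, sizeof(buffer))) > 0)
		process(sum, buffer, (size_t)n);
	close(fd);
	return n < 0 ? -1 : 0;
}

/* mmap(2) variant: no copy, but each page is faulted in on first touch. */
int
digest_mmap(const char *path, uint64_t *sum)
{
	struct stat st;
	void *p;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	*sum = 0;
	if (fstat(fd, &st) < 0) {
		close(fd);
		return -1;
	}
	if (st.st_size == 0) {		/* mmap() rejects a zero length */
		close(fd);
		return 0;
	}
	p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);			/* the mapping outlives the descriptor */
	if (p == MAP_FAILED)
		return -1;
	process(sum, p, (size_t)st.st_size);
	munmap(p, (size_t)st.st_size);
	return 0;
}
```

Both functions return 0 on success and -1 on error, and should leave the
same value in *sum for the same file, so either can be timed in isolation.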

I expected the second way to be faster, as it is supposed to avoid
one memory copy (no user-space buffer). But in reality, on a
CPU-bound (rather than IO-bound) machine, using mmap() is considerably
slower. Here are tcsh's time results:

	Single Pentium2-400MHz running 4.8-stable:
	------------------------------------------
stdio:	56.837u 34.115s 2:06.61 71.8%   66+193k 11253+0io 3pf+0w
mmap:	72.463u 7.534s 2:34.62 51.7%    5+186k 105+0io 22328pf+0w

	Dual Pentium2 Xeon 450MHz running recent -current:
	--------------------------------------------------
stdio:	36.557u 29.395s 3:09.88 34.7%   10+165k 32646+0io 0pf+0w
mmap:	42.052u 7.545s 2:02.25 40.5%    10+169k 16+0io 15232pf+0w

On the IO-bound machine, using mmap is only marginally faster:

	Single Pentium4M (Centrino 1GHz) running recent -current:
	---------------------------------------------------------
stdio:	27.195u 8.280s 1:33.02 38.1%    10+169k 11221+0io 1pf+0w
mmap:	26.619u 3.004s 1:23.59 35.4%    10+169k 47+0io 19463pf+0w

Notice the last two columns in time's output. Why is faulting a page
in on demand so much slower than read()-ing it? I even tried
inserting ``madvise(buffer, file_size, MADV_SEQUENTIAL)'' between the
mmap() and the process(); it made no difference at all (or made the
mmap() version take slightly longer)...

Is this how things are supposed to be, or will mmap() become more
efficient eventually? Thanks!

	-mi