read vs. mmap (or io vs. page faults)

Tue Jun 22 20:27:40 PDT 2004

On Monday 21 June 2004 10:08 pm, Mikhail Teterin wrote:
> On Monday 21 June 2004 08:15 pm, Matthew Dillon wrote:
>
> = :The mmap interface is supposed to be more efficient -- theoretically
> = :-- because it requires one less buffer-copying, and because it
> = :(together with the possible madvise()) provides the kernel with more
> = :information thus enabling it to make better (at least -- no worse)
> = :decisions.
>
> =     Well, I think you forgot my earlier explanation regarding buffer
> =     copying.  Buffer copying is a very cheap operation if it occurs
> =     within the L1 or L2 cache, and that is precisely what is happening
> =     when you read() int
>
> This could explain, why using mmap is not faster than read, but it
> does not explain, why it is slower.
>
> I'm afraid, your vast knowledge of the internals of the kernel
> workings obscure your vision. I, on the other hand, "enjoy" an almost
> total ignorance of it, and can see, that mmap interface _allows_ for
> a more (certainly, no _less_) efficient handling of the IO, than
> read. That the kernel is not using all the information passed to it,
> I can only explain by deficiencies/simplicity of the implementation.

At the risk of prolonging the thread, take a step back for a minute.

10-15 years ago, when mmap was first on the drawing boards as a concept
for unix, the cost of a kernel trap and entering the vm system for
fault recovery, relative to memory bandwidth, was very different from
what it is today.  Back then, getting into the kernel was relatively
painless and memory was proportionally very slow and expensive to use.

These days, however, the memory subsystem is proportionally much
faster relative to the cost of kernel traps and vm processing and
recovery.

The amount of "work" the kernel does for a read() plus a high-speed
memory copy is much less than the cost of taking a page fault, running
a whole bunch of really nasty code in the vm system, repairing the
damage from the page fault, updating the process paging state and
restarting the instruction.

The numbers you're posting are a simple reflection of the fact that the 
read syscall path has fewer (and less expensive) instructions to 
execute compared to the mmap fault paths.
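
As a rough illustration of the two paths (a sketch, not the benchmark
discussed in this thread), the following Python script touches one byte
per page of a file, once through read() into a reused buffer and once
through a read-only mapping, and reports the elapsed time and the
minor-fault count from getrusage() for each.  The helper names, the
chunk size and the 32 MB test file are assumptions made for the
example.  Interpreter overhead dominates the timings, so the fault
counts are the more telling figure, and they vary by kernel because
some kernels map several pages per fault.

```python
import mmap
import os
import resource
import tempfile
import time

PAGE = mmap.PAGESIZE
CHUNK = 16 * PAGE  # read() buffer size; an arbitrary choice for this sketch


def minor_faults():
    # Page faults serviced without disk I/O, as counted for this process.
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def sum_with_read(path):
    # Sum the first byte of every page, copying the file through one
    # reused buffer with unbuffered read() calls.
    buf = bytearray(CHUNK)
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            n = f.readinto(buf)
            if not n:
                break
            total += sum(buf[0:n:PAGE])
    return total


def sum_with_mmap(path):
    # Sum the first byte of every page through a read-only mapping;
    # touching a page that is not yet mapped goes through the fault path.
    total = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for off in range(0, len(m), PAGE):
                total += m[off]
    return total


def measure(fn, path):
    # Returns (result, elapsed seconds, minor faults taken during the call).
    before = minor_faults()
    t0 = time.perf_counter()
    result = fn(path)
    return result, time.perf_counter() - t0, minor_faults() - before


def main():
    # The file was just written, so it is most likely still cached and
    # both runs exercise the in-memory paths rather than the disk.
    with tempfile.NamedTemporaryFile(delete=False) as tf:
        tf.write(os.urandom(32 * 1024 * 1024))
        path = tf.name
    try:
        for fn in (sum_with_read, sum_with_mmap):
            total, secs, faults = measure(fn, path)
            print(f"{fn.__name__}: sum={total} "
                  f"time={secs:.4f}s minor_faults={faults}")
    finally:
        os.unlink(path)


if __name__ == "__main__":
    main()
```

Both functions compute the same sum, so any difference in the reported
figures comes from how the data reached the process, not from the work
done on it.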

Some operating systems implemented read(2) as an internal in-kernel 
mmap/fault/copy/unmap.  Naturally, that made mmap look fast compared to 
read, at the time.  But that isn't how it is implemented in FreeBSD. 

mmap is more valuable as a programmer convenience these days.  Don't
make the mistake of assuming it's faster, especially since the cost of a
copy has gone way down.  Also, don't assume that read() is faster for
cases where you're reading a file that has been mmapped (and possibly
even dirtied) in another process.
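
To make the "convenience" point concrete, here is a minimal sketch of
fetching fixed-size records from a file by index.  The record layout
and the helper names are hypothetical, invented for the example.  With
a mapping the file is addressed like an in-memory buffer; with the
read-style interface each lookup is an explicit positioned read.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout for this sketch: 4-byte id, 8-byte value,
# little-endian, no padding (12 bytes per record).
RECORD = struct.Struct("<IQ")


def write_records(path, count):
    # Write `count` records of the form (i, i * i).
    with open(path, "wb") as f:
        for i in range(count):
            f.write(RECORD.pack(i, i * i))


def record_via_pread(fd, index):
    # One positioned read per lookup; the data is copied into a new
    # bytes object.
    data = os.pread(fd, RECORD.size, index * RECORD.size)
    return RECORD.unpack(data)


def record_via_mmap(m, index):
    # Decode straight out of the mapping, as if the file were an array.
    return RECORD.unpack_from(m, index * RECORD.size)


def main():
    with tempfile.NamedTemporaryFile(delete=False) as tf:
        path = tf.name
    try:
        write_records(path, 1000)
        fd = os.open(path, os.O_RDONLY)
        try:
            with mmap.mmap(fd, 0, access=mmap.ACCESS_READ) as m:
                for i in (0, 7, 999):
                    print(record_via_pread(fd, i), record_via_mmap(m, i))
        finally:
            os.close(fd)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    main()
```

The mapped version is shorter to write and easier to extend to more
complex structures, which is a separate question from which one is
faster.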

-- 
Peter Wemm - peter at wemm.org; peter at FreeBSD.org; peter at yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5