read vs. mmap (or io vs. page faults)

Mon Jun 21 22:26:51 GMT 2004

In the last episode (Jun 21), Mikhail Teterin said:
> > Both read and mmap have a read-ahead heuristic. The heuristic
> > works. In fact, the mmap heuristic is so smart it can read-behind
> > as well as read-ahead if it detects a backwards scan.
> 
> Evidently, read's heuristics are better. At least, for this task.
> I'm, actually, surprised, they are _different_ at all.
[...] 
> That other OSes have similar shortcomings simply gives us some
> breathing room from an advocacy point of view. I hope, my rhetoric
> will burn an itch in someone capable of addressing it technically :-)
> 
> > The heuristic does not try to read megabytes and megabytes ahead,
> > however...
> 
> Neither does the read-handling.

I think part of the problem is that it's just clustering reads instead
of making sure the next N blocks of data are prefetched.  So you may
ask for 8k, but the system will fetch the next 64k of data.  Problem is
the system does nothing until you read the next 8k past the 64k
already read in, then it jumps up and grabs the next 64k.  You're
still waiting on I/O every 8th read.  Ideally it would do an async
fetch of an 8k block (64k ahead of the current read) every time you read
a block.  It should be a lot easier for read to do this, since the
kernel is getting a steady stream of syscalls.  Once a 64k chunk of
mmapped address space is pulled in, the system isn't notified until the
next page fault.  (Or am I misunderstanding how readahead is
implemented on mmapped data?)
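
As a rough illustration of the arithmetic above, here is a toy model in
Python that counts how many 8k reads have to wait on I/O under each
scheme.  It is only a sketch, not kernel code: the function names are
made up for the example, and it assumes an async fetch issued 64k ahead
always completes before the reader gets there.

```python
BLOCK = 8 * 1024                       # size of each application read
CLUSTER = 64 * 1024                    # amount fetched in one go on a miss
BLOCKS_PER_CLUSTER = CLUSTER // BLOCK  # 8 blocks per cluster


def clustered_stalls(n_reads):
    """Reads that block when data is fetched only on a miss, one
    64k cluster at a time (the behaviour described above)."""
    cached = set()
    stalls = 0
    for blk in range(n_reads):
        if blk not in cached:
            stalls += 1  # reader waits for the whole cluster
            cached.update(range(blk, blk + BLOCKS_PER_CLUSTER))
    return stalls


def sliding_stalls(n_reads):
    """Reads that block when every read also starts an async fetch of
    the block 64k ahead.  ASSUMPTION: that fetch always finishes before
    the reader reaches the block."""
    cached = set()
    stalls = 0
    for blk in range(n_reads):
        if blk not in cached:
            stalls += 1  # only the cold start should land here
            cached.update(range(blk, blk + BLOCKS_PER_CLUSTER))
        # Hypothetical async prefetch, one cluster ahead of this read.
        cached.add(blk + BLOCKS_PER_CLUSTER)
    return stalls


if __name__ == "__main__":
    print(clustered_stalls(64))  # 8: one stall per 64k, i.e. every 8th read
    print(sliding_stalls(64))    # 1: only the very first read waits
```

In this model the clustered scheme stalls once per 64k no matter how
long the file is, while the sliding window stalls only on the first
read.  A real disk won't always finish the prefetch in time, so treat
the second number as a best case.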

-- 
	Dan Nelson
	dnelson at allantgroup.com