Reading via mmap stinks (Re: weird bugs with mmap-ing via NFS)
Peter Jeremy
peterjeremy at optushome.com.au
Sat Mar 25 20:14:01 UTC 2006
On Sat, 2006-Mar-25 10:29:17 -0800, Matthew Dillon wrote:
> Really odd. Note that if your disk can only do 25 MBytes/sec, the
> calculation is: 2052167894 / 25MB = ~80 seconds, not ~60 seconds
> as you would expect from your numbers.
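The quoted arithmetic is easy to check; a quick sketch in Python, using the file size and disk rate from the quote (the ~60 s figure is the observed timing under discussion):

```python
# Sanity-check the quoted back-of-envelope numbers: if every byte came
# off the disk, the scan should take ~80 s, so a ~60 s run implies that
# some of the file was served from cache.
file_size = 2052167894          # bytes, from the quoted calculation
disk_rate = 25 * 2**20          # 25 MB/sec, the reported disk throughput

uncached_time = file_size / disk_rate
print(f"all-from-disk estimate: {uncached_time:.1f} s")     # ~78 s

observed_time = 60              # seconds, the timing Dillon refers to
effective_rate = file_size / observed_time / 2**20
print(f"effective rate at 60 s: {effective_rate:.1f} MB/s")
```

An effective rate above what the disk can physically deliver is exactly the signature of partial caching.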
systat was reporting 25-26 MB/sec. dd'ing the underlying partition gives
27 MB/sec (with 24 and 28 for adjacent partitions).
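For reference, a rough Python equivalent of that dd measurement might look like the following (the device path is hypothetical; dd against the raw partition is what was presumably run):

```python
import os, time

def read_throughput(path, bufsize=1 << 20):
    """Sequentially read `path` in 1 MB chunks and return MB/sec --
    roughly what `dd if=<device> of=/dev/null bs=1m` would report."""
    total = 0
    start = time.monotonic()
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            buf = os.read(fd, bufsize)
            if not buf:
                break
            total += len(buf)
    finally:
        os.close(fd)
    elapsed = max(time.monotonic() - start, 1e-9)
    return total / elapsed / 2**20

# e.g. read_throughput("/dev/ad0s1e")   # hypothetical partition name
```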
> This type of situation *IS* possible as a side effect of other
> heuristics. It is particularly possible when you combine read() with
> mmap because read() uses a different heuristic than mmap() to
> implement the read-ahead. There is also code in there which depresses
> the page priority of 'old' already-read pages in the sequential case.
> So, for example, if you do a linear grep of 2GB you might end up with
> a cache state that looks like this:
If I've understood you correctly, this also implies that the timing
depends on the previous two scans, not just the previous scan. I
didn't test all combinations of this but would have expected to see
two distinct sets of mmap/read timings - one for read/mmap/read and
one for mmap/mmap/read.
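For concreteness, the two scan styles being timed can be sketched with Python's mmap module (a simplified stand-in for the grep-style test, not the code actually run):

```python
import mmap, time

def scan_read(path, bufsize=1 << 16):
    """Sequential scan via read(2) -- exercises the read() read-ahead path."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            buf = f.read(bufsize)
            if not buf:
                break
            total += len(buf)
    return total

def scan_mmap(path, bufsize=1 << 16):
    """Sequential scan via mmap -- page faults drive a different heuristic."""
    total = 0
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0,
                                          access=mmap.ACCESS_READ) as m:
        if hasattr(m, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
            m.madvise(mmap.MADV_SEQUENTIAL)   # advisory hint, where present
        for off in range(0, len(m), bufsize):
            total += len(m[off:off + bufsize])
    return total

def timed(fn, path):
    t0 = time.monotonic()
    n = fn(path)
    return n, time.monotonic() - t0
```

Alternating `timed(scan_read, ...)` and `timed(scan_mmap, ...)` over the same large file gives the kind of history-dependent sequence whose timing is at issue here.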
> I need to change it to randomly retain swaths of pages, the
> idea being that it should take repeated runs to rebalance the VM cache
> rather than allowing a single run to blow it out or allowing a
> static set of pages to be retained indefinitely, which is what your
> tests seem to show is occurring.
I don't think this sort of test is a clear indication that something is
wrong. There's only one active process at any time and it's performing
a sequential read of a large dataset. In this case, evicting already
cached data to read new data is not necessarily productive (a
simple-minded algorithm will be evicting data that is going to be
accessed in the near future).
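A toy simulation makes the point concrete: under strict LRU, a repeated sequential scan that exceeds the cache gets zero reuse, while something like the randomly-retained-swaths idea leaves some pages behind for the next run. This is a simplified model of the policies, not the actual VM code:

```python
import random
from collections import OrderedDict

def lru_scan(npages, cache_size, passes=2):
    """Sequential scans under plain LRU: with npages > cache_size,
    each pass evicts exactly the pages the next pass will want first.
    Returns the hit count of the final pass."""
    cache = OrderedDict()
    hits = 0
    for _ in range(passes):
        pass_hits = 0
        for p in range(npages):
            if p in cache:
                pass_hits += 1
                cache.move_to_end(p)
            else:
                if len(cache) >= cache_size:
                    cache.popitem(last=False)   # evict the oldest page
                cache[p] = True
        hits = pass_hits
    return hits

def random_retain_scan(npages, cache_size, passes=2, seed=0):
    """Same scan, but a *random* resident page is evicted on each miss,
    so random swaths of earlier runs survive into the next one."""
    rng = random.Random(seed)
    cache = set()
    hits = 0
    for _ in range(passes):
        pass_hits = 0
        for p in range(npages):
            if p in cache:
                pass_hits += 1
            else:
                if len(cache) >= cache_size:
                    cache.discard(rng.choice(tuple(cache)))
                cache.add(p)
        hits = pass_hits
    return hits
```

With npages well above cache_size, `lru_scan` scores zero hits on its second pass, while the random-eviction variant carries a useful fraction of the file across runs -- qualitatively the behaviour the timings show.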
Based on the timings, the mmap/read case manages to retain ~15% of the
file in cache. Given the amount of RAM available, the theoretical limit
is about 40%, so this isn't too bad. It would be nicer if both read and
mmap managed this gain, irrespective of how the data had been previously
accessed.
--
Peter Jeremy
More information about the freebsd-stable mailing list