problems with mmap() and disk caching
Konstantin Belousov
kostikbel at gmail.com
Fri Apr 6 08:39:03 UTC 2012
On Thu, Apr 05, 2012 at 01:25:49PM -0500, Alan Cox wrote:
> On 04/05/2012 12:31, Konstantin Belousov wrote:
> >On Thu, Apr 05, 2012 at 10:54:31AM -0500, Alan Cox wrote:
> >>On 04/04/2012 02:17, Konstantin Belousov wrote:
> >>>On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:
> >>>>Hi,
> >>>>
> >>>>I open the file, then call mmap() on the whole file to get a pointer,
> >>>>and then I work with this pointer. I expect that each page should have
> >>>>to be touched only once to get it into memory (the disk cache?), but
> >>>>this doesn't work!
> >>>>
> >>>>I wrote a test (attached) and ran it on a 1G file generated from
> >>>>/dev/random; the result is the following:
> >>>>
> >>>>Prepare file:
> >>>># swapoff -a
> >>>># newfs /dev/ada0b
> >>>># mount /dev/ada0b /mnt
> >>>># dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024
> >>>>
> >>>>Purge cache:
> >>>># umount /mnt
> >>>># mount /dev/ada0b /mnt
> >>>>
> >>>>Run test:
> >>>>$ ./mmap /mnt/random-1024 30
> >>>>mmap: 1 pass took: 7.431046 (none: 262112; res: 32; super: 0; other: 0)
> >>>>mmap: 2 pass took: 7.356670 (none: 261648; res: 496; super: 0; other: 0)
> >>>>mmap: 3 pass took: 7.307094 (none: 260521; res: 1623; super: 0; other: 0)
> >>>>mmap: 4 pass took: 7.350239 (none: 258904; res: 3240; super: 0; other: 0)
> >>>>mmap: 5 pass took: 7.392480 (none: 257286; res: 4858; super: 0; other: 0)
> >>>>mmap: 6 pass took: 7.292069 (none: 255584; res: 6560; super: 0; other: 0)
> >>>>mmap: 7 pass took: 7.048980 (none: 251142; res: 11002; super: 0; other: 0)
> >>>>mmap: 8 pass took: 6.899387 (none: 247584; res: 14560; super: 0; other: 0)
> >>>>mmap: 9 pass took: 7.190579 (none: 242992; res: 19152; super: 0; other: 0)
> >>>>mmap: 10 pass took: 6.915482 (none: 239308; res: 22836; super: 0; other: 0)
> >>>>mmap: 11 pass took: 6.565909 (none: 232835; res: 29309; super: 0; other: 0)
> >>>>mmap: 12 pass took: 6.423945 (none: 226160; res: 35984; super: 0; other: 0)
> >>>>mmap: 13 pass took: 6.315385 (none: 208555; res: 53589; super: 0; other: 0)
> >>>>mmap: 14 pass took: 6.760780 (none: 192805; res: 69339; super: 0; other: 0)
> >>>>mmap: 15 pass took: 5.721513 (none: 174497; res: 87647; super: 0; other: 0)
> >>>>mmap: 16 pass took: 5.004424 (none: 155938; res: 106206; super: 0; other: 0)
> >>>>mmap: 17 pass took: 4.224926 (none: 135639; res: 126505; super: 0; other: 0)
> >>>>mmap: 18 pass took: 3.749608 (none: 117952; res: 144192; super: 0; other: 0)
> >>>>mmap: 19 pass took: 3.398084 (none: 99066; res: 163078; super: 0; other: 0)
> >>>>mmap: 20 pass took: 3.029557 (none: 74994; res: 187150; super: 0; other: 0)
> >>>>mmap: 21 pass took: 2.379430 (none: 55231; res: 206913; super: 0; other: 0)
> >>>>mmap: 22 pass took: 2.046521 (none: 40786; res: 221358; super: 0; other: 0)
> >>>>mmap: 23 pass took: 1.152797 (none: 30311; res: 231833; super: 0; other: 0)
> >>>>mmap: 24 pass took: 0.972617 (none: 16196; res: 245948; super: 0; other: 0)
> >>>>mmap: 25 pass took: 0.577515 (none: 8286; res: 253858; super: 0; other: 0)
> >>>>mmap: 26 pass took: 0.380738 (none: 3712; res: 258432; super: 0; other: 0)
> >>>>mmap: 27 pass took: 0.253583 (none: 1193; res: 260951; super: 0; other: 0)
> >>>>mmap: 28 pass took: 0.157508 (none: 0; res: 262144; super: 0; other: 0)
> >>>>mmap: 29 pass took: 0.156169 (none: 0; res: 262144; super: 0; other: 0)
> >>>>mmap: 30 pass took: 0.156550 (none: 0; res: 262144; super: 0; other: 0)
> >>>>
> >>>>If I run this before the test:
> >>>>$ cat /mnt/random-1024 > /dev/null
> >>>>then the result is the following:
> >>>>
> >>>>$ ./mmap /mnt/random-1024 5
> >>>>mmap: 1 pass took: 0.337657 (none: 0; res: 262144; super: 0; other: 0)
> >>>>mmap: 2 pass took: 0.186137 (none: 0; res: 262144; super: 0; other: 0)
> >>>>mmap: 3 pass took: 0.186132 (none: 0; res: 262144; super: 0; other: 0)
> >>>>mmap: 4 pass took: 0.186535 (none: 0; res: 262144; super: 0; other: 0)
> >>>>mmap: 5 pass took: 0.190353 (none: 0; res: 262144; super: 0; other: 0)
> >>>>
> >>>>This is what I expect. But why doesn't this work without reading the
> >>>>file manually?
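
The test attached to Andrey's message was scrubbed by the archive. A
minimal sketch of what such a test plausibly looks like, touching every
page of the mapping and then classifying pages with mincore(2), is
below; the structure and names are assumptions, not the actual attached
code:

    #include <sys/types.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <err.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>

    int
    main(int argc, char **argv)
    {
        struct stat st;
        struct timespec t0, t1;
        char *p, *vec;
        size_t i, npages, none, res, super, other;
        long pgsz;
        int fd, pass, passes;
        volatile char sink;

        if (argc != 3)
            errx(1, "usage: mmap file passes");
        passes = atoi(argv[2]);
        pgsz = sysconf(_SC_PAGESIZE);
        if ((fd = open(argv[1], O_RDONLY)) == -1)
            err(1, "open");
        if (fstat(fd, &st) == -1)
            err(1, "fstat");
        p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
            err(1, "mmap");
        npages = (st.st_size + pgsz - 1) / pgsz;
        if ((vec = malloc(npages)) == NULL)
            err(1, "malloc");

        for (pass = 1; pass <= passes; pass++) {
            /* Touch every page of the mapping sequentially. */
            clock_gettime(CLOCK_MONOTONIC, &t0);
            for (i = 0; i < npages; i++)
                sink = p[i * pgsz];
            clock_gettime(CLOCK_MONOTONIC, &t1);

            /* Classify pages: not in core, resident, superpage-backed. */
            if (mincore(p, (size_t)st.st_size, vec) == -1)
                err(1, "mincore");
            none = res = super = other = 0;
            for (i = 0; i < npages; i++) {
                if (vec[i] == 0)
                    none++;
                else if (vec[i] & MINCORE_SUPER)
                    super++;
                else if (vec[i] & MINCORE_INCORE)
                    res++;
                else
                    other++;
            }
            printf("mmap: %d pass took: %f (none: %zu; res: %zu; "
                "super: %zu; other: %zu)\n", pass,
                (t1.tv_sec - t0.tv_sec) +
                (t1.tv_nsec - t0.tv_nsec) / 1e9,
                none, res, super, other);
        }
        return (0);
    }

Run as in the transcript above, e.g. "./mmap /mnt/random-1024 30"; the
counter names mirror the (none; res; super; other) fields in the output.
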
> >>>The issue seems to be in some change in the behaviour of the
> >>>reservation or physical memory allocator. I have Cc:ed Alan.
> >>I'm pretty sure that the behavior here hasn't significantly changed in
> >>about twelve years. Otherwise, I agree with your analysis.
> >>
> >>On more than one occasion, I've been tempted to change:
> >>
> >> pmap_remove_all(mt);
> >> if (mt->dirty != 0)
> >> vm_page_deactivate(mt);
> >> else
> >> vm_page_cache(mt);
> >>
> >>to:
> >>
> >> vm_page_dontneed(mt);
> >>
> >>because I suspect that the current code does more harm than good. In
> >>theory, it saves activations of the page daemon. However, more often
> >>than not, I suspect that we are spending more on page reactivations than
> >>we are saving on page daemon activations. The sequential access
> >>detection heuristic is just too easily triggered. For example, I've
> >>seen it triggered by demand paging of the gcc text segment. Also, I
> >>think that pmap_remove_all() and especially vm_page_cache() are too
> >>severe for a detection heuristic that is so easily triggered.
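
For reference, Alan's suggestion expressed as a diff against the
sequential-access path in vm_fault() (a sketch only; the placement is
approximate and this was not a committed patch at the time of writing):

    -            pmap_remove_all(mt);
    -            if (mt->dirty != 0)
    -                vm_page_deactivate(mt);
    -            else
    -                vm_page_cache(mt);
    +            vm_page_dontneed(mt);

Note that besides replacing the deactivate/cache decision, this also
drops the pmap_remove_all() call, so the pages behind the faulting
address would stay mapped rather than being unmapped and cached.
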
> >Yes, I agree that such a change should be an improvement, and I expect
> >that Andrey will test it.
> >
> >On the other hand, I do think that the allocator should prefer unnamed
> >pages to pages which still have valid content. On my 12G desktop, I
> >have never seen more than 100MB of cached pages, and similar numbers
> >are observed on 32-48GB servers. I suppose that this is related.
>
> On allocation, the system does prefer free pages over cached pages.
> When cached pages are added to the physical memory allocator, they are
> added to VM_FREEPOOL_CACHE. When pages are allocated, they are taken
> from VM_FREEPOOL_DEFAULT. Generally, pages only move from the CACHE
> pool to the DEFAULT pool when the DEFAULT pool is depleted. (However,
> occasionally, they do move because of coalescing.) When I redid the
> physical memory allocator, I looked at the rate of cached page
> reactivation under the old and the new allocators. At least for the
> tests that I did the rates weren't that different. It was low,
> single-digit percentages. I think the highest likelihood of
> reactivation comes from the pages that are cached by the sequential
> access heuristic because it is so overzealous.
>
> I don't think it's related. You see modest numbers of cached pages
> simply because the page daemon met its target for the sum of free and
> cached pages. So, it just stopped moving pages from the inactive queue
> into the physical memory allocator's cache/free queues.
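
To make the pool preference Alan describes concrete, it amounts to
roughly the following allocation order (a simplified sketch;
take_from_pool() is a hypothetical helper, and the real logic is the
buddy allocator in sys/vm/vm_phys.c):

    /*
     * Sketch of the intended preference: ordinary allocations draw from
     * the default (free) pool, and fall back to the cache pool, which
     * holds pages that still carry valid file contents, only when the
     * default pool is depleted.
     */
    vm_page_t
    vm_phys_alloc_sketch(void)
    {
        vm_page_t m;

        m = take_from_pool(VM_FREEPOOL_DEFAULT);
        if (m != NULL)
            return (m);
        /* Reuse a still-valid cached page only as a last resort. */
        return (take_from_pool(VM_FREEPOOL_CACHE));
    }
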
No, I mean something else. Specifically, I mean that somehow the
preference for unnamed pages does not work. At least, I cannot offer
any other explanation for the following experiment.
Let's take stock HEAD without the change in vm_fault.c. The initial
state of the 8GB machine is as follows; the test file has not even
been stat(2)-ed yet.
Mem: 37M Active, 18M Inact, 150M Wired, 236K Cache, 27M Buf, 7612M Free
Now, run Andrey's original, unmodified test with only one pass, making
a sequential read of the mmap of a 5GB file on a UFS volume. After
the run:
Mem: 38M Active, 18M Inact, 153M Wired, 21M Cache, 30M Buf, 7586M Free
Please note that the cached count increased by only 20M, against 5GB
worth of calls to vm_page_cache(). In other words, it seems that the
allocator almost never touches free memory, always preferring cached
pages. This mostly coincides with what I saw when I profiled the
original problem reported by Andrey.
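
For anyone reproducing this, the same counters that top(1) reports can
be polled programmatically via the VM statistics sysctls (a small
sketch; the sysctl names are the stock FreeBSD ones):

    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <err.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
        u_int ncache, nfree;
        size_t len;

        /* Print the cached and free page counts once per second. */
        for (;;) {
            len = sizeof(ncache);
            if (sysctlbyname("vm.stats.vm.v_cache_count", &ncache,
                &len, NULL, 0) == -1)
                err(1, "v_cache_count");
            len = sizeof(nfree);
            if (sysctlbyname("vm.stats.vm.v_free_count", &nfree,
                &len, NULL, 0) == -1)
                err(1, "v_free_count");
            printf("cache: %u pages, free: %u pages\n", ncache, nfree);
            sleep(1);
        }
    }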