How does disk caching work?
Uwe Doering
gemini at geminix.org
Mon Apr 19 23:17:12 PDT 2004
Igor Shmukler wrote:
>>>Sorry, I shouldn't have been lazy and should have actually looked up
>>>the settings. Yes, those are the settings I was referring to. Someone
>>>else had cranked them up so that the machine was maintaining about
>>>1.7G in cache; he said that he'd noticed a reduction in disk IO when
>>>he did that. I haven't been able to see any difference in disk IO,
>>>though it seems logical that setting the cache too high would hurt
>>>write caching and actually increase disk IO. It's currently set to
>>>whatever the kernel thought best, so I'll just leave it there.
>>
>>Well, I'm afraid your colleague must have been imagining things. The
>>cache queue ('Cache' column in 'top') is just a phase in the laundering
>>procedure (VM page recycling) between the inactive queue ('Inact' in
>>'top') and the free queue ('Free' in 'top'). So these variables have
>>nothing to do with disk i/o performance.
>
> I am not sure you are correct here. I understand things very differently.
> While it is a fact that the number of pages in the cache queue does not affect IO throughput, changing VM settings such as
> vm.stats.vm.v_cache_min, vm.stats.vm.v_cache_max, vm.stats.vm.v_free_target and vm.stats.vm.v_free_min should have an effect on disk IO.
>
> The very reason JD came up with cache pages is to minimize IO traffic. If we require a larger number of free pages, we cause the OS to remove references at an earlier point. This should cause the kernel to re-read some of the pages that would otherwise just be requeued to the active queue.
>
> Having a larger cache queue would require the VM to start cleaning dirty pages earlier, which results in some additional write traffic as well. However, this is not that bad, because here it is a zero-sum game. If pages are to become free, they have to be written out regardless of cache queue size, just at a later point. However, there is a benefit to a larger cache bucket. The upside is that if a machine often experiences bursts in memory demand (as pretty much any real-world server would), you are able to accommodate the changing load without blocking.
Well, I didn't claim that the cache queue was useless. It does have
its merits. And there is a certain default amount configured by the
kernel's auto-scaling code already.
What I was trying to point out is that these variables don't necessarily
do what their names suggest. Take 'vm.v_cache_max', for example. When
you crank that up, instead of increasing the size of the cache queue it
is actually the inactive queue that grows in size.
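You can observe this for yourself. Below is a small userland sketch (my
own illustration, not something from the kernel tree) that prints the
queue sizes via the read-only counters under 'vm.stats.vm'. Run it
before and after raising 'vm.v_cache_max' and watch which queue
actually grows:

  #include <sys/types.h>
  #include <sys/sysctl.h>
  #include <stdio.h>
  #include <stdlib.h>

  /* Fetch one unsigned counter from the VM statistics. */
  static u_int
  vmstat(const char *name)
  {
      u_int val;
      size_t len = sizeof(val);

      if (sysctlbyname(name, &val, &len, NULL, 0) == -1) {
          perror(name);
          exit(1);
      }
      return (val);
  }

  int
  main(void)
  {
      printf("active:   %u pages\n", vmstat("vm.stats.vm.v_active_count"));
      printf("inactive: %u pages\n", vmstat("vm.stats.vm.v_inactive_count"));
      printf("cache:    %u pages\n", vmstat("vm.stats.vm.v_cache_count"));
      printf("free:     %u pages\n", vmstat("vm.stats.vm.v_free_count"));
      return (0);
  }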
The inactive queue grows because the kernel steals pages from it
whenever it temporarily runs out of pages in the cache queue, and it
can do so without blocking for i/o as long as there are clean (not
written to, or already laundered) pages in the inactive queue. When it
finds dirty pages during this scan, it schedules them for background
synchronization with the disk, but again without blocking in the
foreground.
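In simplified pseudo-C, the principle looks roughly like this (an
illustration only, with made-up queue and helper names; the real logic
lives in the VM subsystem sources, e.g. sys/vm/vm_pageout.c, and is
considerably more involved):

  #include <stdbool.h>
  #include <stddef.h>

  /* Simplified stand-ins for the real VM structures. */
  struct page {
      bool dirty;                /* modified since last write-out */
      struct page *next;
  };

  static struct page *cache_queue;     /* clean, unmapped, reusable */
  static struct page *inactive_queue;  /* maybe mapped, maybe dirty */

  static void
  schedule_laundering(struct page *p)
  {
      /* Hand the page to the pageout daemon for asynchronous
       * write-back; the caller does not block on the i/o. */
      (void)p;
  }

  /* Get a reusable page without blocking in the foreground. */
  struct page *
  steal_page(void)
  {
      struct page *p, **pp;

      /* Preferred source: the cache queue. */
      if ((p = cache_queue) != NULL) {
          cache_queue = p->next;
          return (p);
      }

      /* Fall back to the inactive queue: take the first clean page;
       * dirty pages we pass over are merely scheduled for background
       * write-back instead of being waited on. */
      for (pp = &inactive_queue; (p = *pp) != NULL; pp = &p->next) {
          if (!p->dirty) {
              *pp = p->next;     /* unlink the clean page */
              return (p);
          }
          schedule_laundering(p);
      }
      return (NULL);             /* nothing reusable right now */
  }

  int
  main(void)
  {
      struct page dirty_pg = { true, NULL };
      struct page clean_pg = { false, NULL };

      dirty_pg.next = &clean_pg;
      inactive_queue = &dirty_pg;
      /* Skips (and launders) the dirty page, returns the clean one. */
      return (steal_page() == &clean_pg ? 0 : 1);
  }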
The reason for this algorithm is that it is better to keep pages in the
inactive queue for as long as possible, rather than moving them over to
the cache queue prematurely. Pages in the inactive queue can still be
mapped into the memory space of processes, while pages in the cache
queue have lost this association. So, quite naturally, when the VM
system has to reactivate a page (put it back into the active queue) this
operation tends to be less expensive when the page is still in the
inactive queue.
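To make the cost difference concrete, here is an illustrative sketch;
'pmap_reenter' is a made-up stand-in for the page-table work the kernel
actually performs through its pmap layer:

  #include <stdbool.h>

  enum queue { ACTIVE, INACTIVE, CACHE };

  struct vpage {
      enum queue q;
      bool mapped;               /* still entered in a page table? */
  };

  /* Stand-in for rebuilding the page-table entry on reuse. */
  static void
  pmap_reenter(struct vpage *p)
  {
      p->mapped = true;          /* the relatively expensive part */
  }

  /* Put a page back into active use. */
  void
  reactivate(struct vpage *p)
  {
      if (p->q == INACTIVE) {
          /* Cheap: the mapping may still be intact, so the page
           * simply moves back to the active queue. */
          p->q = ACTIVE;
      } else if (p->q == CACHE) {
          /* Costlier: the page has lost its association with the
           * process, so the mapping must be rebuilt first. */
          pmap_reenter(p);
          p->q = ACTIVE;
      }
  }

  int
  main(void)
  {
      struct vpage p = { CACHE, false };

      reactivate(&p);
      return (p.q == ACTIVE && p.mapped ? 0 : 1);
  }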
So, for reasons like these, I keep recommending that you either study
the kernel sources before you try to tune the VM system, or leave these
variables alone.
Uwe
--
Uwe Doering | EscapeBox - Managed On-Demand UNIX Servers
gemini at geminix.org | http://www.escapebox.net