How does disk caching work?

Uwe Doering gemini at geminix.org
Mon Apr 19 09:25:51 PDT 2004


Jim C. Nasby wrote:
> On Mon, Apr 19, 2004 at 08:37:52AM +0200, Uwe Doering wrote:
> 
>>Jim C. Nasby wrote:
>>
>>>On Sat, Apr 17, 2004 at 09:41:19AM +0200, Uwe Doering wrote:
>>>[...]
>>>A few questions if I may...
>>>
>>>What's a good way to tune the amount of space dedicated to IO buffers?
>>
>>You can tune the number of i/o buffers, and therefore indirectly the 
>>amount of memory they may allocate, by using the variable 'kern.nbuf' in 
>>'/boot/loader.conf'.  Note that this number gets multiplied by 16384 
>>(the default filesystem block size) to arrive at the amount of memory it 
>>results in.
>>
>>My experience is that with large amounts of RAM this area becomes 
>>unduly big, though.  It's not that you have to skimp on RAM in this 
>>environment, but the disk i/o buffers eat away at the KVM region (kernel 
>>virtual memory), which happens to be just 1 GB by default and doesn't 
>>grow with the RAM size.  So it can be a good idea to actually reduce the 
>>number of disk i/o buffers (compared to its auto-scaled default) on 
>>systems with plenty of RAM (since you don't need that many buffers, 
>>anyway, due to the VM interaction I just described) and save the 
>>available KVM rather for other purposes (kernel resources).  Systems 
>>that run out of KVM are prone to kernel panics, given the right 
>>combination of circumstances.
> 
> Yes, I was thinking the same thing. What I don't know is what would be a
> good value to use. dirtybuf in systat -v is typically less than 3000,
> which makes 261,000 buffers seem wasteful, but of course that's
> neglecting the read caching aspect.

With regard to the VM interaction I explained earlier, the same goes for 
read caching.  Once read in, file and directory data is kept in VM 
objects attached to the kernel's internal vnodes (files, directories, 
and so on).  So large quantities of disk i/o buffers aren't needed for 
read caching, either.
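
If you want to see that in action, a quick check like the following 
(sysctl names as on a 4.x system; they may differ on other branches) 
shows how little memory the i/o buffers use compared to the page cache:

   sysctl vfs.maxbufspace vfs.bufspace   # KVM reserved for / used by i/o buffers
   sysctl vm.stats.vm.v_inactive_count   # pages of cached file data (and more)
   top -b | head -4                      # compare 'Buf' against 'Inact'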

We have 'kern.nbuf="4096"' for our production systems with 2 GB RAM, 
which results in 64 MB disk i/o cache.  These machines are used for 
server hosting purposes and therefore run all sorts of applications at 
the same time.  Look at the URL in my signature for more details.
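
For reference, the whole tuning amounts to a single line, the 64 MB 
figure simply being 4096 buffers times the 16384-byte buffer size:

   # /boot/loader.conf
   kern.nbuf="4096"   # 4096 * 16384 bytes = 64 MB of disk i/o buffers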

I unfortunately don't know how much buffer space a dedicated database 
server would need, but I suspect that you won't notice any difference 
between 256 MB (default) and 64 MB (kern.nbuf="4096").

More interesting is probably to crank up 'vfs.hirunningspace' and 
'vfs.lorunningspace' (both 'sysctl' variables) so that write operations 
don't stall when plenty of outstanding read requests are waiting for 
completion, which is FreeBSD's classic bottleneck on disk-i/o-oriented 
servers.  In case your RAID controller has a large i/o buffer of its 
own (16 MB or more) you may want to use values in this range:

   vfs.hirunningspace=8388608
   vfs.lorunningspace=6291456

These are in bytes.  Also, disable aggressive read-ahead in the 
controller, if possible.  That feature is aimed at MS Windows and would 
be counterproductive with FreeBSD, which knows better than the 
controller if and when to read ahead.
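
Since these are ordinary sysctl variables you can experiment with them 
at runtime and, once you are happy with the values, make them permanent 
in '/etc/sysctl.conf':

   # try new values on the fly ...
   sysctl -w vfs.hirunningspace=8388608
   sysctl -w vfs.lorunningspace=6291456
   # ... then carry them over into /etc/sysctl.conf to survive reboots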

>>>What impact will vm_min|max_cache have on system performance? Is there
>>>any advantage to setting them fairly high?
>>
>>I'm not quite sure which variables you are referring to.  In FreeBSD 
>>there are 'vm.v_cache_min' and 'vm.v_cache_max'.  I don't recommend 
>>tuning them, though, without having a very deep and thorough look at the 
>>kernel sources.  Many of these variables don't really do what their name 
>>suggests, and there are interdependencies between some of them.  You can 
>>lock up your server by tuning them improperly.
> 
> Sorry, I shouldn't have been lazy and actually looked up the settings.
> Yes, those are the settings I was referring to. Someone else had cranked
> them up so that the machine was maintaining about 1.7G in cache; he said
> that he'd noticed a reduction in disk IO when he did that. I haven't
> been able to see any difference in disk IO, though it seems logical that
> setting cache too high would hurt write caching and actually increase
> disk IO. It's currently set to whatever the kernel thought best, so I'll
> just leave it there.

Well, I'm afraid your colleague must have been imagining things.  The 
cache queue ('Cache' column in 'top') is just a phase in the laundering 
procedure (VM page recycling) between the inactive queue ('Inact' in 
'top') and the free queue ('Free' in 'top').  So these variables have 
nothing to do with disk i/o performance.
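
If you are curious you can watch those queues directly; the page counts 
behind the 'top' columns are exported as sysctl variables (names as on 
4.x, they may vary between branches):

   sysctl vm.stats.vm.v_inactive_count   # inactive queue ('Inact')
   sysctl vm.stats.vm.v_cache_count      # cache queue ('Cache')
   sysctl vm.stats.vm.v_free_count       # free queue ('Free')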

    Uwe
-- 
Uwe Doering         |  EscapeBox - Managed On-Demand UNIX Servers
gemini at geminix.org  |  http://www.escapebox.net

