Manipulating disk cache (buf) settings

Mon May 23 18:18:25 PDT 2005

On Mon, 23 May 2005, John-Mark Gurney wrote:

> Sven Willenberger wrote this message on Mon, May 23, 2005 at 10:58 -0400:
>> We are running a PostgreSQL server (8.0.3) on a dual Opteron system with
>> 8G of RAM. If I interpret them correctly, top and vfs.hibufspace show
>> values of 215MB and 225771520 (which equals 215MB) respectively. My
>> understanding from having searched the archives is that this is the
>> value that is used by the system/kernel in determining how much disk
>> data to cache.
>
> This is incorrect...  FreeBSD merged the vm and buf systems a while back,
> so all of memory is used as a disk cache.

Indeed.  Statistics utilities still haven't caught up with dyson's changes
in 1994 or 1995, so their display of statistics related to disk caching
is very misleading.  systat -v and top display vfs.bufspace but not
vfs.hibufspace.  Both of these are uninteresting.  vfs.bufspace gives the
amount of virtual memory that is currently allocated to the buffer cache.
vfs.hibufspace gives the maximum for this amount.  Virtual memory for
buffers is almost never released, so on active systems vfs.bufspace is
close to the maximum.  The maximum is just a compile-time constant
(BKVASIZE) times a boot-time constant (nbuf).
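
As a rough check of the numbers quoted above, here is a minimal sh sketch.
It assumes BKVASIZE is 16384 bytes (check sys/param.h on your system), and
the helper names bytes_to_mb and bufs_for are made up for illustration.
The sysctl comparison at the end only does anything on FreeBSD.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any FreeBSD tool.

# Convert a byte count to whole megabytes (2^20 bytes), rounding down.
bytes_to_mb() {
    echo $(( $1 / 1048576 ))
}

# Number of BKVASIZE-sized units of buffer virtual memory in a byte count.
# BKVASIZE=16384 is an assumption; set BKVASIZE in the environment to override.
bufs_for() {
    echo $(( $1 / ${BKVASIZE:-16384} ))
}

# Value of vfs.hibufspace quoted in the thread.
hibufspace=225771520
echo "hibufspace: $(bytes_to_mb "$hibufspace") MB, $(bufs_for "$hibufspace") units of BKVASIZE"

# On FreeBSD, compare the current allocation with its limit.
# The block is skipped silently where these sysctls do not exist.
if cur=$(sysctl -n vfs.bufspace 2>/dev/null) &&
   max=$(sysctl -n vfs.hibufspace 2>/dev/null); then
    echo "bufspace: $(bytes_to_mb "$cur") MB of at most $(bytes_to_mb "$max") MB"
fi
```

On an active system the last line should show the two values close
together, which is why neither says much about how much disk data is
actually cached.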

There is no way to tell from userland exactly how much of memory is used
for the vm part of the disk cache.  "inact" in systat -v gives a maximum.
Watch heavy file system activity for a while and you may see "inact" increase as
vm is used for disk data.  It decreases mainly when a file system is
unmounted.  Otherwise, it tends to stay near its maximum, with pages for
not recently used disk data being reused for something else (newer disk
data or processes).

> The buf cache is still used
> for filesystem metadata (and for pending writes of files, but those bufs
> reference the original page, not local storage)...

This is mostly incorrect.  The buffer cache is now little more than a
window on vm.  Metadata is backed by vm except for low quality file
systems.  Directories are backed by vm unless vfs.vmiodirenable is 0
(not the default).

> Just as an experiment, on a quiet system do:
> dd if=/dev/zero of=somefile bs=1m count=2048
> and then read it back in:
> dd if=somefile of=/dev/null bs=1m
> and watch systat or iostat and see if any of the file is read...  You'll
> probably see that none of it is...

Also, with systat -v:
- start with "inact" small and watch it grow as the file is cached
- remove the file and watch "inact" drop.

I haven't tried this lately.  The system has some defence against using up
all of the free and inactive pages for a single file to the exclusion of
other disk data, so you might not get 2GB cached even if you have 4GB of memory.
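
The experiment above could be scripted roughly as follows.  This is only
a sketch: the function names and the 64MB default size are illustrative
choices, not from the thread, and I am assuming that the sysctl
vm.stats.vm.v_inactive_count is the page count behind the "inact" field
in systat -v.  The block size is spelled out in bytes so that the dd
invocation does not depend on the "1m" suffix.

```shell
#!/bin/sh
# Hypothetical helper script for the write/read/remove experiment.

# Print the inactive page count, or "n/a" where the sysctl is unavailable.
inact_pages() {
    sysctl -n vm.stats.vm.v_inactive_count 2>/dev/null || echo "n/a"
}

# Write a file of $2 megabytes at path $1, read it back, then remove it,
# printing the inactive page count at each stage.
cache_experiment() {
    file=$1
    count=${2:-64}
    echo "before write: inact=$(inact_pages)"
    dd if=/dev/zero of="$file" bs=1048576 count="$count" 2>/dev/null || return 1
    echo "after write:  inact=$(inact_pages)"
    dd if="$file" of=/dev/null bs=1048576 2>/dev/null || return 1
    echo "after read:   inact=$(inact_pages)"
    rm -f "$file"
    echo "after remove: inact=$(inact_pages)"
}

# To match the original experiment on a quiet system:
# cache_experiment somefile 2048
```

Run it alongside systat -v or iostat to see whether the read-back step
causes any disk reads.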

> If that is in fact the case, then my question would be how to best
> increase the amount of memory the system can use for disk caching.

Just add RAM and don't run bloatware :-).

Bruce