disk I/O, VFS hirunningspace

Matthew Dillon dillon at apollo.backplane.com
Wed Jul 14 06:19:59 UTC 2010


:void
:waitrunningbufspace(void)
:{
:/*
:        mtx_lock(&rbreqlock);
:        while (runningbufspace > hirunningspace) {
:                ++runningbufreq;
:                msleep(&runningbufreq, &rbreqlock, PVM, "wdrain", 0);
:        }
:        mtx_unlock(&rbreqlock);
:*/
:}
:
:so far, I can't observe any side effects of not running it. Am I on a time
:bomb?
:
:Thank you,
:Jerry

    You can bump up the related sysctl for hirunningspace if it helps
    you, no kernel code modification is needed.  I recommend setting it
    to at least 8MB (8388608).

	sysctl vfs.hirunningspace=8388608
	sysctl vfs.lorunningspace=1048576

    The waitrunningbufspace() code is designed to protect the system from
    several degenerate situations and should be left in place.
    The first is a large backlog of issued WRITE BIOs accumulating on
    block devices.  Because the related buffers are locked during the I/O,
    any attempt to access the data via the buffer cache will unnecessarily
    stall the thread trying to access it.  Without a limit several seconds
    worth of BIOs can accumulate (sometimes tens of seconds worth if the
    I/O is non-linear).  Both accesses to file data and accesses to meta-data
    can wind up stalling, reducing filesystem performance.

    A second issue is that system buffer cache algorithms will become
    severely inefficient if too much of the buffer cache is held in a
    locked state.

    That said, the defaults in bufinit() (lines 623 and 624) are a bit
    too low for today's high-speed I/O subsystems.  They appear to be set
    to fixed assignments of 512K for lo and 1MB for hi.  Even though the
    defaults are too low they still ought to be enough to maintain maximum
    I/O throughput since WRITE BIOs usually complete very quickly (they
    just go into the target device's own write cache and complete).  The
    pipeline should be maintained if the hysteresis is working properly.
    Perhaps something else is broken that is preventing the hysteresis
    from working properly.
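
    As a rough illustration of the hi/lo hysteresis, here is a small
    user-space model.  It is only a sketch and not the kernel code: the
    struct and function names are hypothetical, and it assumes writers
    are held off once the in-flight total exceeds hi and are released
    only after it has drained to lo or below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of the lo/hi watermark pair. */
struct runspace {
	size_t	running;	/* bytes of write I/O currently in flight */
	size_t	lo;		/* release held writers at or below this */
	size_t	hi;		/* hold new writers above this */
	bool	throttled;	/* true while writers are being held */
};

/* Account for a write being issued. */
static void
runspace_start(struct runspace *rs, size_t bytes)
{
	rs->running += bytes;
}

/* Account for a write completing. */
static void
runspace_done(struct runspace *rs, size_t bytes)
{
	assert(bytes <= rs->running);
	rs->running -= bytes;
}

/*
 * Returns true if a writer may issue more I/O right now.
 * Throttling starts above hi and ends only at or below lo, so the
 * state does not flap while the total sits between the two marks.
 */
static bool
runspace_may_issue(struct runspace *rs)
{
	if (rs->running > rs->hi)
		rs->throttled = true;
	else if (rs->running <= rs->lo)
		rs->throttled = false;
	return (!rs->throttled);
}
```

    With lo at 512K and hi at 1MB, a writer that pushes the total over
    1MB stays held in this model until completions bring it back down
    to 512K, at which point it can refill the pipeline in one burst.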

						-Matt



More information about the freebsd-hackers mailing list