disk I/O, VFS hirunningspace
Matthew Dillon
dillon at apollo.backplane.com
Wed Jul 14 06:19:59 UTC 2010
:void
:waitrunningbufspace(void)
:{
:/*
: mtx_lock(&rbreqlock);
: while (runningbufspace > hirunningspace) {
: ++runningbufreq;
: msleep(&runningbufreq, &rbreqlock, PVM, "wdrain", 0);
: }
: mtx_unlock(&rbreqlock);
:*/
:}
:
:so far, I can't observe any side effects of not running it. Am I on a time
:bomb?
:
:Thank you,
:Jerry
You can bump up the related sysctl for hirunningspace if it helps
you, no kernel code modification is needed. I recommend setting it
to at least 8MB (8388608).
sysctl vfs.hirunningspace=8388608
sysctl vfs.lorunningspace=1048576
The waitrunningbufspace() code is designed to protect the system from
several degenerate situations and should be left in place.
One is where a large backlog of issued WRITE BIOs can accumulate on
block devices. Because the related buffers are locked during the I/O,
any attempt to access the data via the buffer cache will unnecessarily
stall the thread trying to access it. Without a limit several seconds
worth of BIOs can accumulate (sometimes tens of seconds worth if the
I/O is non-linear). Both accesses to file data and accesses to meta-data
can wind up stalling, reducing filesystem performance.
A second issue is that system buffer cache algorithms will become
severely inefficient if too much of the buffer cache is held in a
locked state.
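To put rough numbers on the first issue, here is a back-of-the-envelope sketch in C. The helper name drain_seconds() and the throughput figure in the note below are illustrative assumptions, not measurements or kernel code. It only shows how the size of the write backlog and the effective device throughput translate into how long a thread can stall on a locked buffer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the kernel): estimate how long a
 * backlog of queued write I/O takes to drain, i.e. the worst-case wait
 * for a thread that needs a buffer sitting at the end of the queue.
 *
 * backlog_bytes  - bytes of WRITE BIOs already issued to the device
 * bytes_per_sec  - effective device throughput (assumed, must be > 0)
 */
static double
drain_seconds(uint64_t backlog_bytes, uint64_t bytes_per_sec)
{
	assert(bytes_per_sec > 0);
	return (double)backlog_bytes / (double)bytes_per_sec;
}
```

For example, if non-linear writes only achieve an assumed 1MB/sec, an 8MB backlog (drain_seconds(8388608, 1048576)) takes about 8 seconds to clear, and an unbounded backlog takes proportionally longer.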
That said, the defaults in bufinit() (lines 623 and 624) are a bit
too low for today's high-speed I/O subsystems. They appear to be set
to fixed assignments of 512K for lo and 1MB for hi. Even though the
defaults are too low they still ought to be enough to maintain maximum
I/O throughput since WRITE BIOs usually complete very quickly (they
just go into the target device's own write cache and complete). The
pipeline should be maintained if the hysteresis is working properly.
Perhaps something else is broken that is keeping the hysteresis
from working properly.
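For reference, the hysteresis can be modeled in user space roughly as follows. This is a minimal sketch, not the kernel code: struct runspace and the runspace_*() functions are hypothetical names, and it assumes writers are held once the in-flight total exceeds the high mark and are released only after it has drained to the low mark or below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of the lo/hi running-space watermarks. */
struct runspace {
	size_t running;   /* bytes of write I/O currently in flight */
	size_t lo;        /* low watermark  (cf. vfs.lorunningspace) */
	size_t hi;        /* high watermark (cf. vfs.hirunningspace) */
	bool   throttled; /* true while writers are being held back */
};

static void
runspace_init(struct runspace *rs, size_t lo, size_t hi)
{
	assert(lo < hi);
	rs->running = 0;
	rs->lo = lo;
	rs->hi = hi;
	rs->throttled = false;
}

/* A writer issues 'bytes' of I/O; going above hi starts throttling. */
static void
runspace_issue(struct runspace *rs, size_t bytes)
{
	rs->running += bytes;
	if (rs->running > rs->hi)
		rs->throttled = true;
}

/*
 * The device completes 'bytes' of I/O.  Writers are released only once
 * the in-flight total has fallen to lo or below, not as soon as it dips
 * under hi.  That gap between the two marks is the hysteresis.
 */
static void
runspace_complete(struct runspace *rs, size_t bytes)
{
	assert(bytes <= rs->running);
	rs->running -= bytes;
	if (rs->throttled && rs->running <= rs->lo)
		rs->throttled = false;
}
```

With lo=1048576 and hi=8388608, a writer that pushes the total past 8MB stays throttled while completions bring it down through 4MB, and resumes only when it reaches 1MB. If the pipeline stalls even though completions are arriving, the release side of this logic is the place to look.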
-Matt