Very low disk performance on 5.x
Robert Watson
rwatson at FreeBSD.org
Sun May 1 03:34:01 PDT 2005
On Sat, 30 Apr 2005, Arne Wörner wrote:
> 3. The man page geom(4) of R5.3 says "The GEOM framework
> provides an infrastructure in which "classes" can perform
> transformations on disk I/O requests on their path
> from the upper kernel to the device drivers and back."
>
> Could it be, that geom slows something down (in some boxes the reading
> ops are very slow; in my box the writing ops are very slow)?
There are three types of overhead associated with GEOM, some of which
existed in 4.x also, just not under the name "GEOM". Some can be easily
characterized through benchmarking just on 5.x, other bits cannot.
Here they are:
(1) Fixed overhead per-transaction of entering and leaving the GEOM
framework. Because this involves context switches and queueing, this
overhead can be amortized under high transaction rates.
(2) Cost of entering each "GEOM module" as part of the framework, or
costs associated with any GEOM module you might run, which typically
involves allocating a bio, as well as queueing operations.
(3) Cost of specific GEOM modules, such as transforms, RAID, etc -- may
include computation, scatter/gather of small I/Os into larger ones,
etc.
However, it's worth noting that GEOM also introduces performance benefits,
such as creating a clean hand-off separation between the file system code
and the device code, so that MPSAFE devices can interact safely with
non-MPSAFE file systems, and in 6.x, MPSAFE file systems can interact
safely with non-MPSAFE storage devices. It also permits parallelism --
various bits of storage processing and handling can be running on a
separate CPU from a file system generating a set of synchronous I/O's.
One interesting set of micro-benchmarks to identify the incremental costs
of (2) and (3) is to run identical I/O transactions against the same
regions of physical disk using different layers in the partition stack.
I.e., against a region of ad0s1a, against an offset region on ad0s1, and
against a further offset region of ad0. If they're against the same
bits of disk, the main difference here will be the additional processing
of the layers in the stack. A little bit of math is required to figure
out the offset, but dd should be usable to figure out the incremental
cost.
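The offset math can be sketched roughly as follows. This is only an
illustration, not a tested benchmark: the function name dd_commands, the
512-byte sector size, and the example start sectors are all assumptions.
Take the real slice and partition start sectors from your own fdisk and
bsdlabel output, and check whether your label reports offsets relative to
the slice or to the disk before trusting the numbers.

```python
SECTOR = 512  # bytes per sector; an assumption, check your own disk


def dd_commands(slice_start, part_start, region_start, count,
                bs=512, disk="ad0", slice_name="s1", part_name="a"):
    """Build dd read commands that hit the same physical region of disk
    through each layer of the partition stack.

    slice_start  -- first sector of the slice, relative to the disk
    part_start   -- first sector of the partition, relative to the slice
    region_start -- first sector of the test region, relative to the partition
    count        -- number of bs-sized blocks to read
    """
    # The same physical region, expressed as a sector offset per layer.
    layers = [
        (disk + slice_name + part_name, region_start),
        (disk + slice_name, part_start + region_start),
        (disk, slice_start + part_start + region_start),
    ]
    cmds = []
    for dev, sectors in layers:
        byte_offset = sectors * SECTOR
        # dd's skip= counts input blocks of size bs, so the byte offset
        # has to be a whole number of blocks.
        if byte_offset % bs != 0:
            raise ValueError(f"offset on {dev} is not a multiple of bs={bs}")
        cmds.append(f"dd if=/dev/{dev} of=/dev/null bs={bs} "
                    f"skip={byte_offset // bs} count={count}")
    return cmds


# Hypothetical layout: slice starts at sector 63, partition at sector 1024
# of the slice, test region at sector 2048 of the partition.
for cmd in dd_commands(slice_start=63, part_start=1024,
                       region_start=2048, count=100000):
    print(cmd)
```

The commands only read (output goes to /dev/null). Timing each one and
dividing the difference between two layers by count gives a rough
per-transaction cost for the extra layer; repeat the runs a few times,
since a single measurement will be noisy.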
Robert N M Watson