Very low disk performance on 5.x

Scott Long scottl at samsco.org
Sun May 8 06:05:42 PDT 2005


Steven Hartland wrote:
> Summary of results:
> RAID0:
> Changing vfs.read_max 8 -> 16 and MAXPHYS 128k -> 1M
> increased read performance significantly, from 129MB/s to 199MB/s.
> Max raw device speed here was 234MB/s.
> FS -> raw device: 35MB/s (14.9%) performance loss.
> 
> 
> RAID5: Changing vfs.read_max 8 -> 16 produced a small increase,
> from 129MB/s to 135MB/s.
> 
> Increasing MAXPHYS 128k -> 1M prevented vfs.read_max from
> having any effect.
> Max raw device speed here was 200MB/s.
> FS -> raw device: 65MB/s (32.5%) performance loss.
> 
> Note: This batch of tests was done on a uniprocessor kernel to
> keep variation down to a minimum, so the results are not directly
> comparable with my previous tests. All tests were performed with a
> 16k RAID stripe across all 5 disks and a default newfs. Increasing
> or decreasing the filesystem block size was tried but only had
> negative effects.
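
[Editor's note: the vfs.read_max tuning quoted above can also be done
programmatically.  A minimal sketch, assuming a FreeBSD host and root
privileges for the write; sysctl(8) from the shell does the same thing.]

/*
 * Read and (optionally) set vfs.read_max, the cluster read-ahead
 * limit tuned in the tests above.
 */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char **argv)
{
	int cur, new;
	size_t len = sizeof(cur);

	if (sysctlbyname("vfs.read_max", &cur, &len, NULL, 0) == -1) {
		perror("sysctlbyname(vfs.read_max)");
		return (1);
	}
	printf("vfs.read_max = %d\n", cur);

	if (argc > 1) {			/* e.g. ./read_max 16 */
		new = atoi(argv[1]);
		if (sysctlbyname("vfs.read_max", NULL, NULL,
		    &new, sizeof(new)) == -1) {
			perror("sysctlbyname(set)");
			return (1);
		}
		printf("vfs.read_max -> %d\n", new);
	}
	return (0);
}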

Changing MAXPHYS is very dangerous, unfortunately.  The root of the
problem is that kernel virtual memory (KVA) gets assigned to each I/O
buffer as it passes through the kernel. If we allow too much I/O through
at once then we have the very real possibility of exhausting the kernel
address space and causing a deadlock and/or panic.  That is why MAXPHYS
is set so low.  Your dd test is unlikely to trigger a problem, but try
running a bunch of dd's in parallel and you likely will.
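
[Editor's note: a rough sketch of that kind of parallel load follows.
The device path, reader count, and block size are placeholders; each
forked reader issues large sequential reads from its own region so
several big I/O buffers are in flight at once.]

#include <sys/types.h>
#include <sys/wait.h>
#include <err.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define NREADERS	8
#define BLOCKSIZE	(1024 * 1024)	/* mimics a 1M MAXPHYS-sized transfer */
#define NBLOCKS		256		/* 256MB per reader */

int
main(void)
{
	const char *dev = "/dev/da0";	/* placeholder device */
	char *buf;
	int fd, i, n;

	for (i = 0; i < NREADERS; i++) {
		if (fork() != 0)
			continue;
		/* Child: read its own region of the device. */
		if ((fd = open(dev, O_RDONLY)) == -1)
			err(1, "open %s", dev);
		if ((buf = malloc(BLOCKSIZE)) == NULL)
			err(1, "malloc");
		if (lseek(fd, (off_t)i * NBLOCKS * BLOCKSIZE, SEEK_SET) == -1)
			err(1, "lseek");
		for (n = 0; n < NBLOCKS; n++)
			if (read(fd, buf, BLOCKSIZE) <= 0)
				break;
		_exit(0);
	}
	for (i = 0; i < NREADERS; i++)
		wait(NULL);
	return (0);
}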

The solution is to re-engineer the way that I/O buffers pass through
the kernel and only assign KVA when needed (for doing software parity
calculations, for example).  That way we could make MAXPHYS be any
arbitrarily large number and not worry about exhausting KVA.  I believe
that there is some work in progress in this area, but it's a large
project since nearly every single storage driver would need to be
changed.  Another possibility is to recognise that amd64 doesn't have
the same KVA restrictions as i386 and thus can be treated differently.
However, doing the KVA work is still attractive since it'll yield some
performance benefits too.
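
[Editor's note: to illustrate the "assign KVA only when needed" idea
only.  This is not the kernel buf/bio API; it is a user-space analogue
with made-up names, where a buffer is described by (fd, offset, length)
and only mapped when a consumer, such as a parity calculation, actually
has to touch the bytes from the CPU.]

#include <sys/types.h>
#include <sys/mman.h>
#include <stddef.h>

struct lazy_buf {
	int	 fd;		/* backing object */
	off_t	 offset;
	size_t	 length;
	void	*map;		/* NULL until a mapping is required */
};

/* Return a CPU-visible mapping, creating it on first use. */
void *
lazy_buf_map(struct lazy_buf *lb)
{
	if (lb->map == NULL) {
		lb->map = mmap(NULL, lb->length, PROT_READ | PROT_WRITE,
		    MAP_SHARED, lb->fd, lb->offset);
		if (lb->map == MAP_FAILED)
			lb->map = NULL;
	}
	return (lb->map);
}

/* Release the mapping once the CPU no longer needs the data. */
void
lazy_buf_unmap(struct lazy_buf *lb)
{
	if (lb->map != NULL) {
		munmap(lb->map, lb->length);
		lb->map = NULL;
	}
}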

Scott

