dd(1) performance when copying a disk to another (fwd)

Poul-Henning Kamp phk at phk.freebsd.dk
Tue Oct 4 05:14:51 PDT 2005


Robert forwarded this message.

>---------- Forwarded message ----------
>Date: Tue, 4 Oct 2005 10:48:48 +1000 (EST)
>From: Bruce Evans <bde at zeta.org.au>
>To: Tulio Guimarães da Silva <tuliogs at pgt.mpt.gov.br>
>Cc: freebsd-performance at FreeBSD.org
>Subject: Re: dd(1) performance when copying a disk to another

I raised this subject early in the GEOM era but got very little
feedback, so I decided to sit back and wait until it came up again,
and that seems to be now.

First issue: chopping requests.

In the future we will have even larger I/O requests because (at least
we hope) bio requests will get rid of the antique requirement to
be mapped into contiguous kernel VM.

That means that somebody will have to cut I/O requests up somewhere,
and it stands to reason that this should happen as far down as
possible, for reasons of memory management and workload avoidance.

So in the future, device drivers will have to accept bio requests
that are, for all practical purposes, infinite, and service them in
pieces as best they can.
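
A minimal sketch of what servicing a request in pieces could look
like.  Everything here is hypothetical and for illustration only:
struct big_req stands in for a large request (it is not the real
struct bio), xfer_fn stands in for whatever issues one hardware-sized
transfer, and max_xfer is an assumed per-transfer limit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a large I/O request; not the real struct bio. */
struct big_req {
	uint64_t offset;	/* starting byte offset on the device */
	uint64_t length;	/* total number of bytes requested */
};

/* Hypothetical callback that performs one hardware-sized transfer. */
typedef int (*xfer_fn)(uint64_t offset, uint64_t length, void *ctx);

/*
 * Service req in pieces of at most max_xfer bytes each.
 * Returns the number of pieces issued, or -1 if a transfer fails.
 */
static long
service_in_pieces(const struct big_req *req, uint64_t max_xfer,
    xfer_fn xfer, void *ctx)
{
	uint64_t off = req->offset;
	uint64_t left = req->length;
	long pieces = 0;

	assert(max_xfer > 0);
	while (left > 0) {
		/* Take as much as the hardware allows, or whatever is left. */
		uint64_t n = left < max_xfer ? left : max_xfer;

		if (xfer(off, n, ctx) != 0)
			return (-1);
		off += n;
		left -= n;
		pieces++;
	}
	return (pieces);
}

/* Example callback: only adds up the bytes it was asked to move. */
static int
count_bytes(uint64_t offset, uint64_t length, void *ctx)
{
	(void)offset;
	*(uint64_t *)ctx += length;
	return (0);
}
```

With an assumed 128 KB limit, a 1000000 byte request handed to
service_in_pieces() with count_bytes as the callback is issued as
eight transfers, the last one shorter than the rest.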

In addition to chopping, drivers/classes which need to access the
data in the I/O request will need to request VM mapping of it.


Second issue: issuing intelligently sized/aligned requests.

Notwithstanding the above, it makes sense to issue requests that
can be handled as efficiently as possible further down the GEOM mesh.

The chopping is one case, and it can (and will) be solved by
propagating a non-mandatory size-hint upwards.  physio will
be able to use this to send down requests that require minimal
chopping later on.

But the other issue is alignment.  For a RAID-5 implementation it
is paramount for performance that requests try to align themselves
with the stripe size.  Other transformations have similar
requirements, striping and (gbde) encryption for instance.

Therefore, in addition to the size hint, stripe width and stripe
alignment hints need to be passed up, and then physio can start
to send requests that have not only the right size but also
the right alignment for downstream processing.
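
To make the idea concrete, here is a rough sketch of how an upper
layer might combine the hints when sizing its next request.  The
struct io_hints layout and the next_request_len() helper are
assumptions made up for this example, not an existing GEOM interface.
A hint of zero is treated as "no preference".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hints propagated upwards; all values are in bytes. */
struct io_hints {
	uint64_t size_hint;	/* preferred maximum request size, 0 = none */
	uint64_t stripe_width;	/* stripe size, 0 = none */
	uint64_t stripe_offset;	/* byte offset of the first stripe boundary */
};

/*
 * Pick the length of the next request starting at offset, with
 * remaining bytes still to transfer.  The size hint caps the length;
 * if the request would cross a stripe boundary, it is trimmed so that
 * it ends on one, which leaves the following request stripe-aligned.
 */
static uint64_t
next_request_len(uint64_t offset, uint64_t remaining,
    const struct io_hints *h)
{
	uint64_t len = remaining;

	assert(remaining > 0);
	if (h->size_hint != 0 && len > h->size_hint)
		len = h->size_hint;
	if (h->stripe_width != 0) {
		uint64_t w = h->stripe_width;
		/* Position of the request's end within its stripe. */
		uint64_t end_phase =
		    (offset + len + w - h->stripe_offset % w) % w;

		/* Trim only if a non-empty, boundary-ending request is left. */
		if (end_phase < len)
			len -= end_phase;
	}
	return (len);
}
```

For example, with a 128 KB size hint and 64 KB stripes starting at
offset 0, a transfer beginning at byte 32256 (sector 63) gets a first
request of 98816 bytes, so that it ends at byte 131072; every
following request is then a full, aligned 128 KB.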

The outline of this was committed to src/sys/geom/notes around
2½ years ago and the only thing that has changed is that after
some consideration I have concluded that the hints should be
non-binding for performance reasons.


Third issue: The problem extends all the way up to sysinstall.

Currently we systematically shoot down RAID-5 performance by our
strict adherence to MBR formatting rules.  We reserve the first
track, typically 63 sectors, for the MBR.

The first slice therefore starts at sector number 63.  All partitions
in that slice inherit that alignment, and therefore, unless the RAID-5
implementation has a stripe size of 63 sectors, a (too) large
fraction of the requests will have one sector in one RAID stripe
and the rest in another, which they often fail to fill by exactly
one sector.
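
The arithmetic can be checked with a few lines of C.  The helper
below is illustrative only, and the 64-sector stripe size used in the
example afterwards is an assumption, not a property of any particular
RAID-5 implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count how many of nreq back-to-back requests, each req_sectors long
 * and the first one starting at sector start, touch more than one
 * stripe of stripe_sectors sectors.
 */
static unsigned
count_straddling(uint64_t start, uint64_t req_sectors,
    uint64_t stripe_sectors, unsigned nreq)
{
	unsigned i, straddling = 0;

	assert(req_sectors > 0 && stripe_sectors > 0);
	for (i = 0; i < nreq; i++) {
		uint64_t first = start + (uint64_t)i * req_sectors;
		uint64_t last = first + req_sectors - 1;

		/* First and last sector in different stripes: it straddles. */
		if (first / stripe_sectors != last / stripe_sectors)
			straddling++;
	}
	return (straddling);
}
```

With 64-sector stripes and 64-sector requests, a slice starting at
sector 63 makes every single request straddle two stripes: one sector
lands in the first stripe and 63 in the next, one short of filling
it.  Starting the slice at sector 64 instead makes none of them
straddle.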


-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

