gmirror

Sat May 14 10:15:18 PDT 2005

On Sat, 2005-05-14 at 18:00 +0400, Vladimir Dzhivsanoff wrote:

> three parallel tasks of "dd ...." is not good model for random reads ?

Probably not, especially if you start the parallel tasks going at the
same time.  That way, the second and third tasks are almost certainly
hitting data in the hard drive's cache, rather than the actual disk
platters, and so are more likely to be testing the interface transfer
speed than the hard drive's sustained performance.  For a better model
(of parallel large sequential transfers), you should at the very least
stagger the start times of each task, to minimise cache effects.
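
As a rough sketch of what staggering could look like, here is a small
Python stand-in for three parallel dd's.  The function names, the
5-second stagger and the 1 MB block size are illustrative assumptions,
not anything from the original test.  Note that reading a file through
the filesystem also goes through the operating system's buffer cache,
so this only shows the staggering itself, not a calibrated benchmark.

```python
import threading
import time


def sequential_read(path, block_size, results, index):
    """Read the file front to back in block_size chunks, much like dd."""
    total = 0
    start = time.monotonic()
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    # Record (bytes read, elapsed seconds) for this task.
    results[index] = (total, time.monotonic() - start)


def run_staggered(path, tasks=3, stagger_seconds=5.0, block_size=1024 * 1024):
    """Start `tasks` sequential readers, each stagger_seconds after the last.

    The stagger value is an assumption: pick something long enough that a
    later task is not simply re-reading what an earlier one just fetched.
    """
    results = [None] * tasks
    threads = []
    for i in range(tasks):
        t = threading.Thread(
            target=sequential_read, args=(path, block_size, results, i)
        )
        t.start()
        threads.append(t)
        if i < tasks - 1:
            time.sleep(stagger_seconds)  # delay before starting the next task
    for t in threads:
        t.join()
    return results
```

Calling run_staggered() on a test file returns one (bytes, seconds)
pair per task, which can be turned into a per-task throughput figure.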

The better question to ask yourself is this: are large sequential
transfers a good model of my workload?  That is what you are testing
with your dd's.  For random access, seek time is usually the dominant
cost of a disk transfer.
Large sequential transfers are a best-case scenario for I/O measurements
because they involve minimal seek overheads.  However, "best-case" and
"real-world" are not usually the same thing.
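
To see the gap between the two cases, one option is to time the same
number of reads issued once in sequential order and once at random
offsets.  The sketch below is a minimal illustration in Python; the
helper names, the fixed seed and the block-aligned offsets are
assumptions made for the example, and on a small or recently read file
the operating system's buffer cache will hide most of the difference.

```python
import random
import time


def make_offsets(file_size, block_size, count, sequential, seed=0):
    """Return `count` block-aligned offsets, in order or at random."""
    blocks = file_size // block_size
    if sequential:
        # Walk the file front to back, wrapping around if count > blocks.
        return [(i % blocks) * block_size for i in range(count)]
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return [rng.randrange(blocks) * block_size for _ in range(count)]


def timed_reads(path, offsets, block_size):
    """Read block_size bytes at each offset; return (bytes, seconds)."""
    total = 0
    start = time.monotonic()
    with open(path, "rb", buffering=0) as f:
        for offset in offsets:
            f.seek(offset)
            total += len(f.read(block_size))
    return total, time.monotonic() - start
```

Running timed_reads() with both kinds of offset list against a file much
larger than RAM gives a rough idea of how far the random case falls
below the sequential best case.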

Cheers,

Paul.
-- 
e-mail: paul at gromit.dlib.vt.edu

"Without music to decorate it, time is just a bunch of boring production
 deadlines or dates by which bills must be paid."
        --- Frank Vincent Zappa