dd(1) performance when copying a disk to another
Tulio Guimarães da Silva
tuliogs at pgt.mpt.gov.br
Mon Oct 3 08:14:29 PDT 2005
Phew, thanks for that. :) This seems to answer my question in the
other "leg" of the thread, though it hadn't yet reached me when I
wrote that message.
Now THAT's quite a good explanation. ;) Thanks again,
Tulio G. da Silva
Bruce Evans wrote:
> On Mon, 3 Oct 2005, Patrick Proniewski wrote:
>
>>>>> # dd if=/dev/ad4 of=/dev/null bs=1m count=1000
>>>>> 1000+0 records in
>>>>> 1000+0 records out
>>>>> 1048576000 bytes transferred in 17.647464 secs (59417943 bytes/sec)
>>>>
>
> Many wrong answers to the original question have been given. dd with
> a block size of 1m between (separate) disk devices is much slower
> just because that block size is far too large...
>
> The above is a fairly normal speed. The expected speed depends mainly
> on the disk technology generation and the placement of the sectors being
> read. I get the following speeds for _sequential_ _reading_ from the
> outer (fastest) tracks of 6- and 3-year-old drives which are about 2
> generations apart:
>
> %%%
> Sep 25 21:52:35 besplex kernel: ad0: 29314MB <IBM-DTLA-307030> [59560/16/63] at ata0-master UDMA100
> Sep 25 21:52:35 besplex kernel: ad2: 58644MB <IC35L060AVV207-0> [119150/16/63] at ata1-master UDMA100
> ad0 bs 512: 16777216 bytes transferred in 2.788209 secs (6017201 bytes/sec)
> ad0 bs 1024: 16777216 bytes transferred in 1.433675 secs (11702245 bytes/sec)
> ad0 bs 2048: 16777216 bytes transferred in 0.787466 secs (21305320 bytes/sec)
> ad0 bs 4096: 16777216 bytes transferred in 0.479757 secs (34970249 bytes/sec)
> ad0 bs 8192: 16777216 bytes transferred in 0.477803 secs (35113250 bytes/sec)
> ad0 bs 16384: 16777216 bytes transferred in 0.462006 secs (36313842 bytes/sec)
> ad0 bs 32768: 16777216 bytes transferred in 0.462038 secs (36311331 bytes/sec)
> ad0 bs 65536: 16777216 bytes transferred in 0.486850 secs (34460748 bytes/sec)
> ad0 bs 131072: 16777216 bytes transferred in 0.462046 secs (36310693 bytes/sec)
> ad0 bs 262144: 16777216 bytes transferred in 0.469866 secs (35706382 bytes/sec)
> ad0 bs 524288: 16777216 bytes transferred in 0.462035 secs (36311555 bytes/sec)
> ad0 bs 1048576: 16777216 bytes transferred in 0.478534 secs (35059612 bytes/sec)
> ad2 bs 512: 16777216 bytes transferred in 4.115675 secs (4076419 bytes/sec)
> ad2 bs 1024: 16777216 bytes transferred in 2.105451 secs (7968466 bytes/sec)
> ad2 bs 2048: 16777216 bytes transferred in 1.132157 secs (14818809 bytes/sec)
> ad2 bs 4096: 16777216 bytes transferred in 0.662452 secs (25325935 bytes/sec)
> ad2 bs 8192: 16777216 bytes transferred in 0.454654 secs (36901065 bytes/sec)
> ad2 bs 16384: 16777216 bytes transferred in 0.304761 secs (55050416 bytes/sec)
> ad2 bs 32768: 16777216 bytes transferred in 0.304761 secs (55050416 bytes/sec)
> ad2 bs 65536: 16777216 bytes transferred in 0.304765 secs (55049683 bytes/sec)
> ad2 bs 131072: 16777216 bytes transferred in 0.304762 secs (55050200 bytes/sec)
> ad2 bs 262144: 16777216 bytes transferred in 0.304760 secs (55050588 bytes/sec)
> ad2 bs 524288: 16777216 bytes transferred in 0.304762 secs (55050200 bytes/sec)
> ad2 bs 1048576: 16777216 bytes transferred in 0.304757 secs (55051148 bytes/sec)
> %%%
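
[Editor's sketch] The message does not show the script that produced the figures above. A loop along the following lines would give comparable numbers; it is only an illustration. The function name, the 16 MiB default (chosen to match the 16777216 bytes per run in the table) and the use of Python instead of dd are all assumptions. On a regular file the OS cache will distort the timings, so it is only meaningful against a raw disk device.

```python
import os
import time

def time_sequential_reads(path, block_sizes, total_bytes=16 * 1024 * 1024):
    """Read total_bytes from the start of path once for each block size.

    Returns a list of (block_size, bytes_read, seconds) tuples.
    The name and defaults are hypothetical, not from the original post.
    """
    results = []
    for bs in block_sizes:
        fd = os.open(path, os.O_RDONLY)
        try:
            done = 0
            start = time.perf_counter()
            while done < total_bytes:
                # One read(2) call per block, as dd would issue.
                chunk = os.read(fd, min(bs, total_bytes - done))
                if not chunk:  # end of file or device
                    break
                done += len(chunk)
            elapsed = time.perf_counter() - start
        finally:
            os.close(fd)
        results.append((bs, done, elapsed))
    return results
```

Calling it with `[512 << i for i in range(12)]` covers the same block sizes as the table, 512 bytes through 1048576 bytes.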
>
> Drive technology hit a speed plateau a few years ago so newer single
> drives aren't much faster unless they are more expensive and/or smaller.
>
> The speed is low for small block sizes because the device has to be
> talked to too much and the protocol and firmware are not very good.
> (Another drive, a WDC 120GB with more cache (8MB instead of 2), ramps
> up to about half speed (26MB/sec) for a block size of 4K but sticks
> at that speed for block sizes 8K and 16K, then jumps up to full speed
> for block sizes of 32K and larger. This indicates some firmware
> stupidness). Most drives ramp up almost linearly (doubling
> the block size almost doubles the speed). This behaviour is especially
> evident on slow SCSI drives like some (most?) ZIP and dvd/cd. The
> command overhead can be 20 msec, so you had better not do only 512 bytes
> of i/o per command or you will get a speed of 25K/sec. The command
> overhead of a new ATA drive is more like 50 usec, but that is still
> far too much for high speed with a block size of 512 bytes.
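
[Editor's sketch] The overhead argument can be put as a one-line model: each command costs a fixed overhead plus the time to move the data at the drive's media rate. This is a simplification assumed here for illustration, not a formula from the post. The function name is hypothetical, and the 55e6 bytes/sec media rate in the usage note is taken from the ad2 plateau in the table.

```python
def throughput(block_size, overhead_s, media_rate):
    """Estimated bytes/sec when every command transfers block_size bytes.

    Assumed model: time per command = fixed overhead + block_size / media_rate.
    overhead_s is in seconds, media_rate in bytes/sec.
    """
    return block_size / (overhead_s + block_size / media_rate)
```

With a 20 msec overhead and 512-byte blocks, `throughput(512, 0.020, 55e6)` comes out a little above 25000 bytes/sec, in line with the 25K/sec quoted above. With 50 usec overhead the same block size stays under 9 MB/sec in this model, well short of the media rate.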
>
> The speed is insignificantly different for block sizes larger than a
> limit because the drive's physical limits dominate except possibly
> with old (slow) CPUs.
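
[Editor's sketch] To make the plateau concrete, the following takes the ad2 read rates from the table above and finds the smallest block size that already gets close to the best rate. The 95% threshold and the function name are assumptions chosen for illustration.

```python
# Read rates for ad2 in bytes/sec, copied from the table in the post.
AD2_RATES = {
    512: 4076419,
    1024: 7968466,
    2048: 14818809,
    4096: 25325935,
    8192: 36901065,
    16384: 55050416,
    32768: 55050416,
    65536: 55049683,
    131072: 55050200,
    262144: 55050588,
    524288: 55050200,
    1048576: 55051148,
}

def smallest_near_max(rates, fraction=0.95):
    """Return the smallest block size whose rate is at least
    fraction times the best measured rate (threshold is an assumption)."""
    best = max(rates.values())
    return min(bs for bs, rate in rates.items() if rate >= fraction * best)
```

For the ad2 data, `smallest_near_max(AD2_RATES)` returns 16384: everything from 16K upward is within a fraction of a percent of the best rate.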
>
>>>> That seems to be 2 or about 2 times faster than disc->disc
>>>> transfer... But still slower, than I would have expected...
>>>> SATA150 sounds like the drive can do 150MB/sec...
>>>
>>
>> As Eric pointed out, you just can't reach 150 MB/s with one disk;
>> it's a technological maximum for the bus, but real-world performance
>> is well below this max.
>> In fact, I thought I would reach about 50 to 60 MB/s.
>
>
> 50-60 MB/s is about right. I haven't benchmarked any SATA or very new
> drives. Apparently they are not much faster. ISTR that WDC Raptors are
> specced for 70-80MB/sec. You pay twice as much to get a tiny drive with
> only 25% more throughput plus faster seeks.
>
>>>>>> (Maybe you could find a way to copy /dev/zero to /dev/ad6
>>>>>> without destroying the previous work... :-))
>>>>>
>>>>>
>>>>> well, not very easy: both disks are the same size ;)
>>>>
>>
>>>> I thought of the first 1000 1MB blocks... :-)
>>>
>>
>> damn, I misread this one... :)
>> I'm gonna try this asap.
>
>
> I divide disks into equally sized (fairly small, or half the disk size)
> partitions, and cp between them. dd is too hard to use for me ;-). cp
> is easier to type and automatically picks a reasonable block size. Of
> course I use dd if the block size needs to be controlled, but mostly I
> only use it in preference to cp to get its timing info.
>
>> ...
>>
>>> Have you tried a smaller block size? What does 8k, 16k, or 512k do
>>> for you? There really isn't much room for improvement here on a
>>> single device.
>>
>>
>> nope, I'll try one of them, but I can't do many experiments, the box
>> is in my living room, it's a 1U rack, and it's VERY VERY noisy. My
>> girlfriend will kill me if it's running more than an hour a day :))
>
>
> Smaller block sizes will go much faster, except for copying from a
> disk to itself. Large block sizes are normally a pessimization and
> the pessimization is especially noticeable for dd. Just use the
> smallest block size that gives an almost-maximal throughput (e.g.,
> 16K for reading ad2 above, possibly different for writing). Large
> block sizes are pessimal for synchronous i/o like dd does. The timing
> for dd'ing blocks of size N MB at R MB/sec between ad0 and ad2 is
> something like:
>
> time in secs       activity on ad0       activity on ad2
> ------------       ---------------       ---------------
> 0                  start read of 1MB     idle
> N/R                finish read; idle     start write of 1MB
> 2*N/R-epsilon      start read of 1MB     pretend to complete write
> 2*N/R              continue read         complete write
> 3*N/R-epsilon      finish read; idle     start write of 1MB
> 4*N/R-2*epsilon    ...                   ...
>
> After the first block (which takes a little longer), it takes
> 2*N/R-epsilon seconds to copy 1 block, where epsilon is the time
> between the writer's pretending to complete the write and actually
> completing it. This time is obviously not very dependent on the block
> size since it is limited by the drive's resources and policies (in
> particular, if the drive doesn't do write caching, perhaps because
> write caching is not enabled, then epsilon is 0, and if our block size
> is large compared with the drive's cache then the drive won't be able
> to signal completion until no more than the drive's cache size is left
> to do). Thus epsilon becomes small relative to the N/R term when N is
> large. Apparently, in your case the speed drops from 59MB/sec to
> 35MB/sec, so with N == 1 and R == 59, epsilon is about 1/200
> (2/59 - 1/35 is about 0.0053).
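
[Editor's sketch] The epsilon estimate can be checked with a few lines of arithmetic. The model assumed here is that each block costs one read time plus one write time (N/R each), less the overlap epsilon gained when the drive reports the write complete early. The function names are hypothetical; the 59 and 35 MB/sec figures are the ones quoted above.

```python
def epsilon_from_speeds(n_mb, read_rate, copy_rate):
    """Overlap time in seconds under the assumed model.

    Time per block is 2*N/R - epsilon, and the observed copy rate is
    N / (time per block), so epsilon = 2*N/R - N/copy_rate.
    """
    return 2 * n_mb / read_rate - n_mb / copy_rate

def copy_rate(n_mb, read_rate, epsilon):
    """Copy speed in MB/sec for block size n_mb under the same model."""
    return n_mb / (2 * n_mb / read_rate - epsilon)
```

`epsilon_from_speeds(1, 59, 35)` gives roughly 0.0053 seconds, which is where the "about 1/200" comes from. With no early completion (epsilon of 0), `copy_rate(1, 59, 0)` is 29.5 MB/sec, exactly half the read rate.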
>
> With large block sizes, the speed can be increased using asynchronous
> output. There is a utility (in ports) named team that fakes async
> output using separate processes. I have never used it. Something as
> simple as 2 dd's in a pipe should work OK.
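
[Editor's sketch] The idea behind both suggestions is to let the next read proceed while the previous block is still being written. The following illustrates that with two threads and a small bounded queue. It is not how team or a dd pipeline is implemented, and the function name and defaults are assumptions. It relies on Python releasing its interpreter lock during file I/O, so the overlap only matters for real files or devices.

```python
import queue
import threading

def copy_overlapped(src, dst, block_size=1024 * 1024, depth=2):
    """Copy file object src to dst, writing in a separate thread.

    depth bounds how many blocks the reader may get ahead of the writer.
    Returns the number of bytes read from src.
    """
    blocks = queue.Queue(maxsize=depth)
    errors = []

    def writer():
        try:
            while True:
                block = blocks.get()
                if block is None:  # sentinel: reader is finished
                    return
                dst.write(block)
        except Exception as exc:
            errors.append(exc)
            # Keep draining so the reader never blocks on a full queue.
            while blocks.get() is not None:
                pass

    thread = threading.Thread(target=writer)
    thread.start()
    total = 0
    try:
        while True:
            block = src.read(block_size)
            if not block:
                break
            blocks.put(block)
            total += len(block)
    finally:
        blocks.put(None)
        thread.join()
    if errors:
        raise errors[0]
    return total
```

The bounded queue plays the role of the pipe buffer between two dd processes: the reader stalls once it is `depth` blocks ahead, so memory use stays fixed.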
>
> For copying from a disk to itself, a large block size is needed to
> limit the number of seeks, and concurrent reads and writes are exactly
> what is not needed (since they would give competing seeks). The i/o
> must be sequentialized, and dd does the right things for this, though
> the drive might not (you would prefer epsilon == 0, since if the drive
> signals write completion early then it might get confused when you
> flood it with the next read and seek to start the read before it
> completes the write, then thrash back and forth between writing and
> reading).
>
> It is interesting that writing large sequential files to at least the
> ffs file system (not mounted with -sync) in FreeBSD is slightly faster
> than writing directly to the raw disk using write(2), even if the
> device driver sees almost the same block sizes for these different
> operations. This is because write(2) is synchronous and sync writes
> always cause idle periods (the idle periods are just much smaller for
> writing data that is already in memory), while the kernel uses async
> writes for data.
>
> Bruce