misc/107443: dd fails to copy from disk to disk when bad sectors are involved

James Risner risner at stdio.com
Tue Jan 2 13:50:34 PST 2007


>Number:         107443
>Category:       misc
>Synopsis:       dd fails to copy from disk to disk when bad sectors are involved
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Jan 02 21:50:33 GMT 2007
>Closed-Date:
>Last-Modified:
>Originator:     James Risner
>Release:        5.5 and 6.2
>Organization:
OpenWorld, Inc
>Environment:
FreeBSD akira.stdio.com 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #2: Mon Sep 11 10:55:57 EDT 2006     risner at akira.stdio.com:/usr/obj/usr/src/sys/AKIRA  i386

>Description:
I have had this problem multiple times over the years.

I am unwise: I run drives with no redundancy until they develop bad sectors, then copy the data from one drive to another (usually with dd).

I first ran into the "you can not correctly copy disk to disk" problem with this ticket:
bin/22347: dd copies incorrect data after 2^32 bytes and I/O error

In that case, I solved the problem with a simple patch to dd source.

When that ticket was closed, I assumed the problem was fixed.  I recently had
another disk fail and copied the disk with dd:

I ran fdisk and partitioned the destination disk correctly, then ran:
dd if=/dev/ad0s1a bs=512 of=/dev/ad1s1a conv=noerror,sync
That partition was 8 gigabytes.

The approximately 2000 bad sectors cause the end result to be shifted down 4096 bytes.  By this I mean that when I examined (with hexdump) a region of the destination partition (/dev/ad1s1a) around 7 GB in, it was not aligned with the same region of the source (/dev/ad0s1a).  Because of this shift, fsck cannot find the superblocks, and none of the alternate superblocks match either.
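One way to locate where such a shift begins is to compare the two partitions sector by sector at identical offsets. The following is only a minimal sketch: the function name first_mismatch and the BLOCK_SIZE constant are hypothetical and not part of dd or of the program in this report, and it assumes both descriptors support pread(2).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 512	/* assumed sector size, matching bs=512 above */

/*
 * Hypothetical helper: return the index of the first BLOCK_SIZE-byte block
 * that differs between fa and fb, or -1 if no readable block differs within
 * the first nblocks blocks.  Blocks that cannot be read in full from either
 * side (for example bad sectors) are reported and skipped.
 */
off_t
first_mismatch(int fa, int fb, off_t nblocks)
{
	char a[BLOCK_SIZE], b[BLOCK_SIZE];
	off_t i;

	assert(nblocks >= 0);
	for (i = 0; i < nblocks; i++) {
		off_t pos = i * BLOCK_SIZE;

		if (pread(fa, a, sizeof(a), pos) != (ssize_t)sizeof(a) ||
		    pread(fb, b, sizeof(b), pos) != (ssize_t)sizeof(b)) {
			fprintf(stderr, "block %lld unreadable, skipped\n",
			    (long long)i);
			continue;
		}
		if (memcmp(a, b, sizeof(a)) != 0)
			return i;
	}
	return -1;
}
```

A caller would open /dev/ad0s1a and /dev/ad1s1a read-only and pass the partition size in sectors. If the copy is shifted, the first mismatch should be reported at or shortly after the first bad sector.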

I solved this problem by writing my own tiny dd program:

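#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <unistd.h>

char *src = "/dev/ad0s1a";
int fsrc;
char *dst = "/dev/ad1s1a";
int fdst;
char buf[512];          // one 512-byte sector

int
main(void)
{
off_t flen = 16777216;  // size of partition in 512-byte sectors (8 GB)
off_t i;
ssize_t ret;            // signed, so a -1 return can be detected

fsrc = open(src, O_RDONLY);
if (fsrc < 0)
        {
        perror("open src");
        exit(1);
        }
fcntl(fsrc, F_SETFL, O_NONBLOCK);
fdst = open(dst, O_WRONLY);
if (fdst < 0)
        {
        perror("open dst");
        exit(1);
        }
fcntl(fdst, F_SETFL, O_NONBLOCK);

printf("fsrc = %d, fdst = %d\n", fsrc, fdst);

for (i=0; i<flen; i++)
        {
        printf("%lld\n", (long long)i);
        // zero the buffer so an unreadable sector is written as zeros
        bzero(buf, sizeof(buf));
        ret = pread(fsrc, buf, sizeof(buf), (off_t)512*i);
        if (ret == 0)
                {
                printf("%lld blocks written\n", (long long)i);
                exit(1);
                }
        if (ret<0)
                {
                printf("read error (%lld)\n", (long long)i);
                perror("read");
                }
        // always write a full sector so the destination offset keeps
        // pace with the source offset
        ret = write(fdst, buf, sizeof(buf));
        if (ret == 0)
                {
                printf("eof on write (%lld)\n", (long long)i);
                exit(1);
                }
        if (ret<0)
                {
                printf("write error (%lld)\n", (long long)i);
                perror("write");
                }
        }
return 0;
}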

>How-To-Repeat:

Find a failed drive, larger than 2 GB, that has many bad sectors.

Copy from the failed drive to a new drive using dd on FreeBSD.

>Fix:

I have no idea.  My current fix was to use my tiny rdd program, which I wrote by hand on the spot once I had identified why fsck failed on the new disk: the data was not aligned properly in the destination.

It seems that when there are read errors on the source, dd loses track of the destination offset and does not advance it properly, even with the "sync" conversion?
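
If that hypothesis is right, one way to rule out drift entirely is to write every block at an explicit offset instead of relying on the destination's file position. The sketch below illustrates that idea and is not a patch to dd: copy_blocks and BLOCK_SIZE are hypothetical names, and it assumes both descriptors support pread(2) and pwrite(2).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 512	/* assumed sector size, matching bs=512 */

/*
 * Hypothetical helper: copy up to nblocks blocks from fsrc to fdst, writing
 * each block at the same offset it was read from.  A block whose read fails
 * is written as zeros and counted in *nbad.  Returns the number of blocks
 * written, or -1 if a write fails or is short.
 */
off_t
copy_blocks(int fsrc, int fdst, off_t nblocks, off_t *nbad)
{
	char buf[BLOCK_SIZE];
	off_t i;

	assert(nblocks >= 0 && nbad != NULL);
	*nbad = 0;
	for (i = 0; i < nblocks; i++) {
		off_t pos = i * BLOCK_SIZE;
		ssize_t n;

		memset(buf, 0, sizeof(buf));
		n = pread(fsrc, buf, sizeof(buf), pos);
		if (n == 0)
			break;		/* end of source reached */
		if (n < 0) {
			/* discard any partial data and pad with zeros */
			memset(buf, 0, sizeof(buf));
			(*nbad)++;
		}
		/* explicit offset: a read error cannot shift later blocks */
		if (pwrite(fdst, buf, sizeof(buf), pos) != (ssize_t)sizeof(buf)) {
			perror("pwrite");
			return -1;
		}
	}
	return i;
}
```

Because the destination offset is computed from the block index on every iteration, a failed or short read can at worst zero one block. It cannot move the blocks that follow it.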
>Release-Note:
>Audit-Trail:
>Unformatted:

