Disk errors when copying

Ted Mittelstaedt tedm at toybox.placo.com
Sun Sep 9 23:02:24 PDT 2007



> -----Original Message-----
> From: Lars Eighner [mailto:luvbeastie at larseighner.com]
> Sent: Sunday, September 09, 2007 11:17 AM
> To: Ted Mittelstaedt
> Cc: Richard Tobin; freebsd-questions at freebsd.org
> Subject: RE: Disk errors when copying
>
>
> On Fri, 7 Sep 2007, Ted Mittelstaedt wrote:
>
> >
> >
> >> Subject: Disk errors when copying
> >>
> >>
> >> When copying between disks (ad10 -> ad8), I get errors:
> >>
> >> ad10: WARNING - READ_DMA48 UDMA ICRC error (retrying request) LBA=435128800
> >> ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=435128800
> >> g_vfs_done():ad10s2g[READ(offset=175562145792, length=131072)]error = 5
> >>
> >> I don't get these errors just reading the data from ad10.  Is this
> >> some kind of system error rather than a bad disk?  Is it a known
> >> problem?
> >>
> >
> > Yes it is a known problem.  It does not happen with most combinations
> > of drives and controllers.  You need to exhaustively document the
> > motherboard/controller/hard disk and put it into a PR and file it
> > so that the developer can add your combo into his database.  The more
> > of these that are documented the quicker that a correlation is going
> > to show up and get fixed.
>
> I wish I'd known that before I trashed my disc and spent a couple of weeks
> and hundreds of bucks building a new system.
>

One rule of thumb when you have hardware problems with a new system
(I'm assuming, of course, that these UDMA errors have been happening
since the system was built) is to search both the freebsd-questions
mailing list archives and the PR database, covering closed PRs as well
as open ones.  Closed PRs in particular are a wealth of information,
because so many of them were closed for lack of followup.

A typical scenario is that someone reports a problem like the one you're
having, and three months later the developer makes a change in the code
and asks the reporter to test the change and see if it fixed the
problem.  By then the original reporter has gone on to something else
and doesn't respond.  The developer then closes the PR and assumes that
whatever he did fixed the problem.

If you do find closed PRs that are the same problem and same hardware
as yours, definitely refer to their numbers in your PR.
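
As a rough illustration of pulling the searchable details out of console
messages like the ones quoted above before pasting them into a PR, here
is a minimal Python sketch.  The function name parse_ata_errors and the
field names are made up for this example, and the regular expression
only assumes the "adN: WARNING/FAILURE - COMMAND ... LBA=n" layout seen
in this thread; other ata(4) messages may well look different.

```python
import re

# Matches messages shaped like the ones quoted in this thread, e.g.
# "ad10: FAILURE - READ_DMA48 status=51<...> error=10<...> LBA=435128800".
# This layout is an assumption taken from the thread, not a full grammar.
ATA_MSG = re.compile(
    r"(?P<dev>ad\d+): (?P<level>WARNING|FAILURE) - (?P<cmd>\w+)"
    r"(?P<detail>.*?)LBA=(?P<lba>\d+)"
)


def parse_ata_errors(log_text):
    """Return one dict per ATA warning/failure found in log_text."""
    # Drop mail quote markers ("> >> ") and undo mail line-wrapping so
    # each message and its LBA= field end up on one logical line.
    lines = [re.sub(r"^[>\s]+", "", line) for line in log_text.splitlines()]
    flat = " ".join(" ".join(lines).split())
    events = []
    for m in ATA_MSG.finditer(flat):
        events.append({
            "device": m.group("dev"),
            "level": m.group("level"),
            "command": m.group("cmd"),
            "detail": m.group("detail").strip(),
            "lba": int(m.group("lba")),
        })
    return events


# The messages from the original report, wrapped as they were in the mail.
SAMPLE = """\
ad10: WARNING - READ_DMA48 UDMA ICRC error (retrying request)
LBA=435128800
ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR>
error=10<NID_NOT_FOUND> LBA=435128800
"""

if __name__ == "__main__":
    for event in parse_ata_errors(SAMPLE):
        print(event)
```

The command and detail fields (READ_DMA48, ICRC, NID_NOT_FOUND) are the
sort of strings worth feeding to the archive and PR searches, next to
the controller and drive model.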

Ted


