WRITE_DMA errors on SATA drive under 5.3-RELEASE

Anthony Atkielski atkielski.anthony at wanadoo.fr
Mon Feb 28 19:27:45 GMT 2005

Garance A Drosihn writes:

> First question: which SATA controller are you using?

The controller is built into the Asus P4P800-E motherboard, and is
based on the Intel ICH5R southbridge chipset.  There's also a Promise
20378 RAID controller on board but I do NOT use it (disabled in BIOS).

> And what is the make&model of the hard drives that you are using?

The SATA drives are two identical Western Digital WD1200JD 120-GB
drives, 7200 RPM.  Device ad10 holds /tmp and /var; device ad12 holds

There is also a third drive, an older Samsung SV4002H (40 GB), connected
to the primary IDE controller.  This drive holds the root /.

Although the error messages I've seen name ad10 (the first SATA drive),
smartctl says that no errors have occurred on either of these
drives--whereas it does show a log of errors on the third drive (ad0)
that seem to correspond mysteriously to the errors in the message.
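
One way to cross-check which device the kernel is actually naming is to
tally the WRITE_DMA messages per device from the system log.  What
follows is only a minimal sketch in Python, not a tested tool.  It
assumes each relevant kernel line carries a device prefix such as
"ad10:" followed somewhere by the text WRITE_DMA; the function name
count_write_dma and the sample lines are hypothetical and made up for
illustration, not copied from any real log.

```python
import re
from collections import Counter

# Assumption: a relevant line contains a device name like "ad10:" and,
# later on the same line, the token WRITE_DMA.
PATTERN = re.compile(r"\b(ad\d+):.*\bWRITE_DMA\b")


def count_write_dma(lines):
    """Return a Counter mapping device name -> number of WRITE_DMA lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, invented for this sketch.
SAMPLE = [
    "Feb 28 03:01:12 host kernel: ad10: WARNING - WRITE_DMA (sample line)",
    "Feb 28 03:01:13 host kernel: ad0: some unrelated message",
    "Feb 28 04:17:40 host kernel: ad10: FAILURE - WRITE_DMA (sample line)",
]

if __name__ == "__main__":
    for device, total in sorted(count_write_dma(SAMPLE).items()):
        print(device, total)
```

To run it against a real log, pass an open file object instead of
SAMPLE.  The per-device totals can then be set beside the per-drive
error logs that smartctl reports, to see whether the counts line up with
ad10 or with ad0.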

> Note: There have been several different threads on different mailing
> lists from users having WRITE_DMA errors similar to this. At least
> some of the problem is in the code which handles disk I/O.

So I've surmised.  The problem seems to be quite rare, but since this is
a production server I worry about disk writes not being completed; I
have no easy way to tell whether writes were actually lost or not.

> I realize that none of that info really helps you right now, but
> I just thought I would say that it may be you're not having any
> hardware problems.  Or at least, not on the disk itself.  It might
> be a problem with the disk-controller, or it might be fairly minor
> timing-problems that come up under certain kinds of load.

I don't think there are any hardware problems at all.  This isn't a
terribly exotic configuration.  It's probably a bug or configuration


