ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=11441599

O. Hartmann ohartman at mail.uni-mainz.de
Tue Aug 9 08:23:36 GMT 2005


Mike Tancsa wrote:
> At 08:25 PM 08/08/2005, O. Hartmann wrote:
> 
>> Hello.
>>
>> My box is an ASUS A8N-SLI Deluxe based AMD64 machine running FreeBSD
>> 6.0-BETA2 (see dmesg).
>> One of my SATA disks, the SAMSUNG SP2004C, seems to show errors during
>> operation (and also showed them under 5.4-RELEASE-p3).
>> Sometimes I get this error:
>> ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=11441599
>> while the machine still keeps working.
>> Other days the box crashes completely.
>>
>> Is this an operating system bug, or is this message evidence of
>> defective hardware?
> 
> 
> You can probably confirm a hardware issue with smartmontools
> (/usr/ports/sysutils/smartmontools).
> 
> It was quite handy the other day when we had to work out whether a
> problem lay with a drive tray or with the drive itself.  We started to see:
> 
> Aug  3 02:02:49 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=391423
> Aug  3 02:03:00 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=2304319
> Aug  3 02:03:10 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=2312927
> Aug  3 02:03:17 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=2308639
> Aug  3 02:03:26 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=2309855
> Aug  3 02:03:37 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=2348359
> Aug  4 12:12:37 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=1528639
> Aug  4 12:13:04 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=1530031
> Aug  4 12:13:04 verify1 kernel: ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=1528639
> Aug  4 12:13:04 verify1 kernel: ad0: FAILURE - READ_DMA timed out
> Aug  4 12:13:04 verify1 kernel: spec_getpages:(ad0s1a) I/O read failure: (error=5) bp 0xd630b4fc vp 0xc2640d68
> 
> Yet when we read the actual error info off the drive via smartctl -a
> ad0, it was clean.  That pointed to the drive tray, which we swapped, and
> all was well.  In other situations, however, the SMART info will often
> tell you if the drive is starting to fail.  It's not 100% reliable, but
> since we started using it, it has generally given us some sort of heads-up
> as to whether or not a drive is in trouble.
> 
> 
>         ---Mike

Dear Mike.
Thanks a lot for this info.
I will use this tool and report what I find out.
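
For reading the smartctl output, something like the following might help.
It is only a rough sketch: it assumes the usual ten-column attribute table
that smartctl prints, and the attribute names in WATCHED are an assumption,
since which attributes a drive reports differs between vendors. The SAMPLE
text is made up for illustration and is not output from the SP2004C.

```python
# Attribute names to watch; a given drive may not report all of them
# (assumption: names as they usually appear in smartctl's attribute table).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector",
           "UDMA_CRC_Error_Count")

# Hypothetical sample of an attribute table, for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       37
"""


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} for rows of the attribute table."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric ID and have at least ten columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


def suspicious(attrs):
    """Return the watched attributes whose raw value is non-zero."""
    return {name: attrs[name] for name in WATCHED if attrs.get(name, 0) > 0}


if __name__ == "__main__":
    print(suspicious(parse_smart_attributes(SAMPLE)))
```

A non-zero CRC counter with clean sector counters would hint at the link
(cable, connector, tray) rather than the platters, but that is a rule of
thumb, not a diagnosis.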

I also use trays for my drives (like I did with SCSI and SCA2 on our 
servers at the lab). Maybe this could be an issue.
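
To see which kind of error dominates, the kernel messages can be tallied
per device. The sketch below only knows the two message shapes quoted in
this thread; the category names ("icrc", "timeout", "failure") are my own
labels. As far as I understand it, an ICRC error means the data was
corrupted on the way between drive and controller, which would fit a bad
cable, connector or tray, but the counts alone prove nothing.

```python
import re
from collections import Counter

# Matches the message shapes quoted in this thread, e.g.
#   ad10: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=11441599
#   ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=391423
PATTERN = re.compile(r"\b(ad\d+): (WARNING|TIMEOUT|FAILURE) - (.*)")


def classify(line):
    """Return (device, kind) for a recognised ata message, else None."""
    m = PATTERN.search(line)
    if not m:
        return None
    dev, level, rest = m.groups()
    if "ICRC" in rest:
        kind = "icrc"        # interface CRC error on the drive link
    elif level == "TIMEOUT":
        kind = "timeout"     # request timed out and is being retried
    elif level == "FAILURE":
        kind = "failure"     # request given up
    else:
        kind = "other"
    return dev, kind


def tally(lines):
    """Count recognised messages per (device, kind)."""
    counts = Counter()
    for line in lines:
        result = classify(line)
        if result:
            counts[result] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Usage (file name is up to you): python tally.py < /var/log/messages
    for (dev, kind), n in sorted(tally(sys.stdin).items()):
        print(dev, kind, n)
```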

Oliver


More information about the freebsd-questions mailing list