ad0: READ command timeout....

Stephen McKay smckay at internode.on.net
Fri May 23 07:01:05 PDT 2003


On Monday, 19th May 2003, Tenebrae wrote:

>On Sun, 18 May 2003, Dan Langille wrote:
>
>> This morning I found a frozen box.  On the console was this:
>>
>> ad0: READ command timeout tag=0 serv=0 - resetting
>> ata0: resetting devices .. ata0-slave: ATA identify retries exceeded
>> done
>>
>> After reboot, those messages were found in /var/log/messages.
>>
>> I'm running FreeBSD 4.8-RC from Apr 4 10:45:49 EST 2003.
>
>I've had similar problems.  This has come up a few times on the mailing
>list in the past, especially regarding certain lines of IBM drives.
>Suggestions include replacing your IDE cable, making sure it's not longer
>than 18 inches...and replacing the drive as it's probably on its last leg.
>Back it up while you can.

I am plagued with all sorts of ata command timeout problems.  Some of the
hardware is brand new, some of it not.  I think the problem is in the ata
code.  If not, how can I get some detailed reporting so I can trace the
true culprit?  How can I get at the so-called SMART info?  And what's in
the -current ata driver that hasn't made it back into -stable?

I have an IBM disk:
    ad4: 78533MB <IC35L080AVVA07-0> [159560/16/63] at ata2-master UDMA100

On both a VIA 686B and a SiI0680 RAID controller, this drive will give
errors like this when under load:

May 23 23:00:11 peon /kernel: ad4: READ command timeout tag=0 serv=0 - resetting
May 23 23:00:11 peon /kernel: ata2: resetting devices .. ad4: DMA limited to UDMA33, non-ATA66 cable or device
May 23 23:00:11 peon /kernel: done

Obviously it has an ATA66-compliant cable or it would never go to UDMA100
in the first place.  After the "command timeout", suddenly it's no longer
compliant.  I've been forced to run this drive permanently at UDMA33 just
to get some use out of it.

I have a brand new Maxtor disk:
    ad6: 117246MB <Maxtor 6Y120L0> [238216/16/63] at ata3-master UDMA133

This also has a fun time on the SiI0680 controller:

May 23 23:00:09 peon /kernel: ad6: WRITE command timeout tag=0 serv=0 - resetting
May 23 23:00:09 peon /kernel: ata3: resetting devices .. done
...
May 23 23:11:42 peon /kernel: ad6: READ command timeout tag=0 serv=0 - resetting
May 23 23:11:42 peon /kernel: ata3: resetting devices .. done
...
May 23 23:28:15 peon /kernel: ad6: READ command timeout tag=0 serv=0 - resetting
May 23 23:28:15 peon /kernel: ata3: resetting devices .. done
May 23 23:28:54 peon /kernel: ad6: READ command timeout tag=0 serv=0 - resetting
May 23 23:28:54 peon /kernel: ata3: resetting devices .. done

The last few were after I throttled it back to UDMA100.  I suppose I'll have
to run it at UDMA33 too, just to get stability.
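
To see at a glance which devices are timing out and how often, the
messages can be tallied with a few lines of script.  This is only a rough
sketch: the pattern is inferred from the log excerpts above, not from the
ata driver source, and the names count_timeouts and TIMEOUT_RE are made up
for illustration.

```python
import re
from collections import Counter

# Matches e.g. "ad6: READ command timeout".  The pattern is an assumption
# based on the console messages quoted above.
TIMEOUT_RE = re.compile(r"(\w+): (READ|WRITE) command timeout")


def count_timeouts(lines):
    """Count timeout messages per (device, operation) pair."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

Passing it an open log file, for example
count_timeouts(open("/var/log/messages")), returns a Counter keyed by
device and operation.  Lines that were wrapped mid-word by the console
still match, because the pattern stops at the word "timeout".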

By the way, the "WRITE command timeout" caused file corruption.  I would
have thought the write would have been retried.  Apparently not.  Luckily
I'm in the habit of diffing the results of large file copies against the
originals.
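
For anyone who wants to do the same check, here is a minimal sketch of
comparing a copy against its original and reporting where they first
diverge.  It is not the exact procedure I use, just one way to do it; the
function name first_difference and the 1 MB chunk size are arbitrary
choices for the example.

```python
def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the first mismatching byte within this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No mismatch in the common part: one file is shorter.
                return offset + min(len(a), len(b))
            if not a:
                # Both files ended at the same point with no difference.
                return None
            offset += len(a)
```

A result of None means the copy matches; any other value is the offset
of the first corrupted or missing byte, which helps to tie the damage to
a particular timeout in the log.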

Recently I mentioned I have a Liteon 24x burner on a VIA 686 controller:
    acd0: CD-RW <LITE-ON LTR-24102B> at ata1-master UDMA33

I have to run it at WDMA2 or I get DMA errors.  There are some rumours
that this is Liteon's fault.  Maybe not, given how many of my IDE
devices spit out ata error messages.

And I don't just live in a weird anti-computer zone. :-)  All my SCSI gear
just works.  And always at the rated speed too.  Pity I can afford so little
of it. :-(

Stephen.

