WRITE command timeout

Vulpes Velox kitbsdlists at HotPOP.com
Sat Dec 20 14:14:08 PST 2003


On Sat, 20 Dec 2003 19:07:41 +0100
"Oivind H. Danielsen" <oivind.danielsen at kopek.net> wrote:

> Hello.
> 
> We have been running FreeBSD 4.6-5.1 systems for 1.5 years and are being
> plagued by these:
> 
>  Dec 18 15:15:39 <> /kernel: ad0: WRITE command timeout tag=0 serv=0 - resetting
>  Dec 19 15:03:23 <> /kernel: ad0: READ command timeout tag=0 serv=0 - resetting

This is most likely caused by a failing drive or a bad cable.
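
To see whether the timeouts cluster on one drive or one kind of operation, it can help to tally them from the log. Below is a minimal Python sketch, assuming the kernel messages look like the two quoted above. The function name count_timeouts and the sample lines are only for illustration; nothing here is part of FreeBSD itself.

```python
import re
from collections import Counter

# Matches kernel messages of the form quoted above, e.g.
#   ... /kernel: ad0: WRITE command timeout tag=0 serv=0 - resetting
# Assumption: the device is named adN and the operation is READ or WRITE.
TIMEOUT_RE = re.compile(r"(?P<dev>ad\d+): (?P<op>READ|WRITE) command timeout")


def count_timeouts(lines):
    """Return a Counter keyed by (device, operation) for timeout messages."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("dev"), match.group("op"))] += 1
    return counts


# The two messages from the original report, used here as sample input.
sample = [
    "Dec 18 15:15:39 <> /kernel: ad0: WRITE command timeout tag=0 serv=0 - resetting",
    "Dec 19 15:03:23 <> /kernel: ad0: READ command timeout tag=0 serv=0 - resetting",
]

for (dev, op), n in sorted(count_timeouts(sample).items()):
    print(dev, op, n)
```

To run it against a real log, pass an open file object (for example the system message log) to count_timeouts instead of the sample list.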

> In our rack we have 34 identical drives (IBM IC35L080AVVA07).
> 
>   24 drives on Windows 2000    : no problems.
>    4 drives on Linux 2.4.x     : no problems.
> 
>    2 drives on RELENG_4_8
>     (VIA 82C686, VIA C3)       : no problems
> 
>    4 drives on RELENG_4_8
>     (nVIDIA nForce, XP 2000+)  : r/w timeouts, fs corruption.
> 
>   (1 drive/system, 6 FreeBSD boxes)
> 
> The good systems have been running the 1.5 years without a hitch. The
> four identical RELENG_4_8 systems have all had corrupted filesystems (at
> least once every two months).
> 
> 
> We have tried the following:
> 
>  - Changed ATA100 cables (3 diff. types, all 80-wire)
>  - Disabled DMA (use PIO4) (hw.ata.ata_dma="0" in loader.conf)
>  - Disabled DMA in BIOS setup
>  - Changed motherboard (MSI MS6734, VIA KM400, vt8235 ATA)
>  - Changed power supply (added 100W)
>  - RELENG_5_1.
> 
> None of these changes has helped. The only change seen when disabling
> DMA is additional messages: "timeout waiting for DRQ - resetting".
> 
> 
> I have searched the net for more information on this topic for over a
> year, and all I find are replies like:
> 
>   - "Just change the cable, dude.."   (did that, still timeouts)
>   - "IBM drives are bad for you."     (seen this with other drives too)
>                                       (drives work well on Linux/W2k)
>   - "Disabling DMA fixes it."         (tried that, it didn't)
>   - "ATA is for wimps. SCSI rulezz."  (different discussion)
> 
> 
> # sysctl hw.ata
> hw.ata.ata_dma: 0
> hw.ata.wc: 1
> hw.ata.tags: 0
> hw.ata.atapi_dma: 0
> 
> # atacontrol mode 0
> Master = PIO4
> Slave  = ???
> 
> # atacontrol info 0
> Master:  ad0 <IC35L080AVVA07-0/VA4OA52A> ATA/ATAPI rev 5
> Slave:       no device present
> 
> 
> dmesg, pciconf and kernel config are attached. No special compilation
> options (except -DIPFW2) are used. I can provide more information on
> request.
> 
> We're now running FreeBSD 4.8-RELEASE-p14 and FreeBSD 5.1-RELEASE-p8,
> but the problem has been around since we started out with 4.6, I
> believe. The "good" and "bad" FreeBSD systems all use the same
> kernel/world.
> 
> 
> The reason we have used such low-end hardware in these boxes is that
> they are part of a highly redundant cluster solution for crypto
> processing (no storage is used for application purposes). This means the
> system can cope with the occasional fs corruption, but we would still
> prefer to get rid of it.
> 
> 
> I know this problem has been discussed before, but wanted to add more
> data to the discussion. I don't think all of the reports should be
> attributed to bad HW. Nevertheless, even if the hardware is broken, the
> system should preferably function as well (or as badly) as with Linux/W2k.
> 
> 
> Any help is greatly appreciated.
> 
> 
> Best Regards,
> 
> Oivind H. Danielsen
> 


More information about the freebsd-stable mailing list