WRITE command timeout
Vulpes Velox
kitbsdlists at HotPOP.com
Sat Dec 20 14:14:08 PST 2003
On Sat, 20 Dec 2003 19:07:41 +0100
"Oivind H. Danielsen" <oivind.danielsen at kopek.net> wrote:
> Hello.
>
> We have been running FreeBSD 4.6-5.1 systems for 1.5 years and are being
> plagued by these:
>
> Dec 18 15:15:39 <> /kernel: ad0: WRITE command timeout tag=0 serv=0 - resetting
> Dec 19 15:03:23 <> /kernel: ad0: READ command timeout tag=0 serv=0 - resetting
This is most likely caused by the drive going bad or a bad cable.
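
To help tell a failing drive from a cabling problem, it may be useful to tally how often each device logs a timeout and for which operation. The following is only a rough sketch: the regular expression is inferred from the two messages quoted above, the name count_timeouts is made up for illustration, and it assumes any log lines wrapped by the mail client have been rejoined first.

```python
import re
from collections import Counter

# Pattern assumed from the two kernel messages quoted above;
# other ATA driver versions may word these messages differently.
TIMEOUT_RE = re.compile(r"(?P<dev>ad\d+): (?P<op>READ|WRITE) command timeout")


def count_timeouts(lines):
    """Count timeout messages, keyed by (device, operation)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("dev"), match.group("op"))] += 1
    return counts


# The two messages from the original report, with wrapped lines rejoined.
sample = [
    "Dec 18 15:15:39 <> /kernel: ad0: WRITE command timeout tag=0 serv=0 - resetting",
    "Dec 19 15:03:23 <> /kernel: ad0: READ command timeout tag=0 serv=0 - resetting",
]

for (dev, op), n in sorted(count_timeouts(sample).items()):
    print(dev, op, n)
```

In practice the lines would come from the system log (for example /var/log/messages) rather than a hard-coded list.
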
> In our rack we have 34 identical drives (IBM IC35L080AVVA07).
>
> 24 drives on Windows 2000 : no problems.
> 4 drives on Linux 2.4.x : no problems.
>
> 2 drives on RELENG_4_8
> (VIA 82C686, VIA C3) : no problems
>
> 4 drives on RELENG_4_8
> (nVIDIA nForce, XP 2000+) : r/w timeouts, fs corruption.
>
> (1 drive/system, 6 FreeBSD boxes)
>
> The good systems have been running the 1.5 years without a hitch. The
> four identical RELENG_4_8 systems have all had corrupted filesystems (at
> least once every two months).
>
>
> We have tried the following:
>
> - Changed ATA100 cables (3 diff. types, all 80-wire)
> - Disabled DMA (use PIO4) (hw.ata.ata_dma="0" in loader.conf)
> - Disabled DMA in BIOS setup
> - Changed motherboard (MSI MS6734, VIA KM400, vt8235 ATA)
> - Changed power supply (added 100W)
> - RELENG_5_1.
>
> None of these changes has helped. The only change seen when disabling
> DMA is additional messages: "timeout waiting for DRQ - resetting".
>
>
> I have searched the net for more information on this topic for over a
> year, and all I find is replies like:
>
> - "Just change the cable, dude.." (did that, still timeouts)
> - "IBM drives are bad for you." (seen this with other drives too)
> (drives work well on Linux/W2k)
> - "Disabling DMA fixes it." (tried that, it didn't)
> - "ATA is for wimps. SCSI rulezz." (different discussion)
>
>
> # sysctl hw.ata
> hw.ata.ata_dma: 0
> hw.ata.wc: 1
> hw.ata.tags: 0
> hw.ata.atapi_dma: 0
>
> # atacontrol mode 0
> Master = PIO4
> Slave = ???
>
> # atacontrol info 0
> Master: ad0 <IC35L080AVVA07-0/VA4OA52A> ATA/ATAPI rev 5
> Slave: no device present
>
>
> dmesg, pciconf and kernel config are attached. No special compilation
> options (except -DIPFW2) are used. I can provide more information on
> request.
>
> We're now running FreeBSD 4.8-RELEASE-p14 and FreeBSD 5.1-RELEASE-p8,
> but the problem has been around since we started out with 4.6 I
> believe. The "good" and "bad" FreeBSD systems all use the same
> kernel/world.
>
>
> The reason why we have used such low-end hardware in these boxes is that
> they are part of a highly redundant cluster solution for crypto
> processing (no storage is used for application purposes). This means the
> system can cope with the occasional fs corruption, but we would still
> prefer to get rid of it.
>
>
> I know this problem has been discussed before, but wanted to add more
> data to the discussion. I don't think all of the reports should be
> attributed to bad HW. Nevertheless, even if the hardware is broken, the
> system should preferably function as well (or as badly) as with Linux/W2k.
>
>
> Any help is greatly appreciated.
>
>
> Best Regards,
>
> Oivind H. Danielsen
>
More information about the freebsd-stable
mailing list