kern/72451: Continuing problems with Silicon Image SATA controllers
Mikhail Teterin
mi at aldan.algebra.com
Fri Oct 8 10:01:05 PDT 2004
>Number: 72451
>Category: kern
>Synopsis: Continuing problems with Silicon Image SATA controllers
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Fri Oct 08 17:00:51 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator: Mikhail Teterin
>Release: FreeBSD 5.3-BETA5 amd64
>Organization:
Virtual Estates, Inc.
>Environment:
System: FreeBSD pandora 5.3-BETA5 FreeBSD 5.3-BETA5 #4: Mon Sep 20 16:45:55 EDT 2004 mteterin at pandora:/backup/obj/usr/src/sys/DIOSCURI amd64
Relevant dmesg.boot entries:
atapci0: <SiI 3114 SATA150 controller> port 0x9c00-0x9c0f,0xa000-0xa003,0xa400-0xa407,0xa800-0xa803,0xac00-0xac07 mem 0xff3ff400-0xff3ff7ff irq 17 at device 11.0 on pci3
ad6: 190782MB <ST3200822AS/3.01> [387621/16/63] at ata3-master SATA150
Ident information from the running kernel:
$FreeBSD: src/sys/dev/ata/ata-all.c,v 1.227 2004/09/16 09:35:01 sos Exp $
$FreeBSD: src/sys/dev/ata/ata-queue.c,v 1.34 2004/08/27 14:48:32 sos Exp $
$FreeBSD: src/sys/dev/ata/ata-lowlevel.c,v 1.47 2004/09/03 12:10:44 sos Exp $
$FreeBSD: src/sys/dev/ata/ata-isa.c,v 1.22 2004/04/30 16:21:34 sos Exp $
$FreeBSD: src/sys/dev/ata/ata-pci.c,v 1.88 2004/08/20 06:19:25 sos Exp $
$FreeBSD: src/sys/dev/ata/ata-chipset.c,v 1.88 2004/09/10 10:31:37 sos Exp $
$FreeBSD: src/sys/dev/ata/ata-dma.c,v 1.131 2004/09/10 10:31:37 sos Exp $
$FreeBSD: src/sys/dev/ata/ata-disk.c,v 1.177 2004/09/01 12:15:44 sos Exp $
$FreeBSD: src/sys/dev/ata/atapi-cd.c,v 1.171 2004/08/24 10:39:00 sos Exp $
$FreeBSD: src/sys/dev/ata/atapi-fd.c,v 1.97 2004/08/05 21:11:33 sos Exp $
>Description:
Under _combined_ disk and CPU load, the following errors start
popping up:
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=53404031
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=54910687
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=56806527
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=61715903
ad6: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=62103999
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=176444927
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=311594591
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=196040671
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=306623743
After a while, all disk I/O starts hanging and even a graceful
reboot becomes impossible -- the machine hangs after saying:
"some processes would not die..."
We have already replaced the disk and the cables twice.
Under just the disk load, the problem does not appear -- the
box survives a full run of `iozone -a' without a hitch, for
example.
But when we, for example, dump databases onto it (over NFS) and,
at the same time, gzip the dump for archiving, the errors appear.
The same happens when a big file is uploaded with scp over a fast
link with ssh compression. So it looks like something inside the
ata driver is not being attended to fast enough...
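For anyone triaging similar logs, the hexadecimal status=51 and error=4
values in the FAILURE lines above can be decoded bit by bit. The sketch
below is only an illustration, not part of the driver: READY, DSC, ERROR
and ABORTED are confirmed by the log itself, while the other bit names
are assumed from the standard ATA register layout, and parse_failure()
is a hypothetical helper name.

```python
import re

# (mask, name) pairs, highest bit first, matching the order seen in the log.
# READY, DSC, ERROR and ABORTED are confirmed by the log lines; the other
# names are assumptions based on the standard ATA register layout.
STATUS_BITS = [
    (0x80, "BUSY"), (0x40, "READY"), (0x20, "DEVICE_FAULT"), (0x10, "DSC"),
    (0x08, "DRQ"), (0x04, "CORRECTED"), (0x02, "INDEX"), (0x01, "ERROR"),
]
ERROR_BITS = [
    (0x80, "ICRC"), (0x40, "UNCORRECTABLE"), (0x20, "MEDIA_CHANGED"),
    (0x10, "ID_NOT_FOUND"), (0x08, "MEDIA_CHANGE_REQUEST"), (0x04, "ABORTED"),
    (0x02, "NO_MEDIA"), (0x01, "ILLEGAL_LENGTH"),
]

# Matches lines such as:
# ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=53404031
FAILURE_RE = re.compile(
    r"^(?P<dev>\w+): FAILURE - (?P<cmd>\w+) "
    r"status=(?P<status>[0-9a-fA-F]+)<[^>]*> "
    r"error=(?P<error>[0-9a-fA-F]+)<[^>]*> "
    r"LBA=(?P<lba>\d+)$"
)


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in table if value & mask]


def parse_failure(line):
    """Parse one FAILURE line; return None for anything else (e.g. TIMEOUT)."""
    m = FAILURE_RE.match(line.strip())
    if m is None:
        return None
    status = int(m.group("status"), 16)  # the register values are hexadecimal
    error = int(m.group("error"), 16)
    return {
        "device": m.group("dev"),
        "command": m.group("cmd"),
        "status_flags": decode(status, STATUS_BITS),
        "error_flags": decode(error, ERROR_BITS),
        "lba": int(m.group("lba")),
    }
```

Feeding the first FAILURE line through parse_failure() yields the same
flag names the kernel printed, which is a quick sanity check that the
device itself aborted the write rather than reporting a media error.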
>How-To-Repeat:
Run `iozone -a' on a disk while gzipping a big file from
the same drive.
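The two steps above could be scripted roughly as follows. This is a
sketch under assumptions, not a tested reproducer: it assumes iozone and
gzip are on the PATH, that `iozone -a' places its test file in the
current working directory, and the function names and any paths passed
in are hypothetical.

```python
import os
import subprocess


def build_commands(big_file):
    """Return the two command lines from the report: disk load and CPU load."""
    return [
        ["iozone", "-a"],           # disk load, as named in the report
        ["gzip", "-c", big_file],   # CPU load, reading from the same drive
    ]


def run_combined_load(workdir, big_file):
    """Run iozone in workdir while gzip compresses big_file to /dev/null.

    workdir and big_file are assumed to live on the affected disk.
    Returns the exit codes of (iozone, gzip).
    """
    iozone_cmd, gzip_cmd = build_commands(big_file)
    with open(os.devnull, "wb") as sink:
        # Start both at once so the disk and CPU load overlap.
        iozone = subprocess.Popen(iozone_cmd, cwd=workdir, stdout=sink)
        gzip = subprocess.Popen(gzip_cmd, stdout=sink)
        gzip_rc = gzip.wait()
        iozone_rc = iozone.wait()
    return iozone_rc, gzip_rc
```

For example, run_combined_load("/mnt/ad6", "/mnt/ad6/dump.sql") would
exercise the drive the same way, with both paths being placeholders for
a directory and a large file on the disk attached to the SiI 3114.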
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: