kern/72451: Continuing problems with Silicon Image SATA controllers

Mikhail Teterin mi at aldan.algebra.com
Fri Oct 8 10:01:05 PDT 2004


>Number:         72451
>Category:       kern
>Synopsis:       Continuing problems with Silicon Image SATA controllers
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Oct 08 17:00:51 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator:     Mikhail Teterin
>Release:        FreeBSD 5.3-BETA5 amd64
>Organization:
Virtual Estates, Inc.
>Environment:
System: FreeBSD pandora 5.3-BETA5 FreeBSD 5.3-BETA5 #4: Mon Sep 20 16:45:55 EDT 2004 mteterin at pandora:/backup/obj/usr/src/sys/DIOSCURI amd64

Relevant dmesg.boot entries:

	atapci0: <SiI 3114 SATA150 controller> port 0x9c00-0x9c0f,0xa000-0xa003,0xa400-0xa407,0xa800-0xa803,0xac00-0xac07 mem 0xff3ff400-0xff3ff7ff irq 17 at device 11.0 on pci3
	ad6: 190782MB <ST3200822AS/3.01> [387621/16/63] at ata3-master SATA150

Ident information from the running kernel:

     $FreeBSD: src/sys/dev/ata/ata-all.c,v 1.227 2004/09/16 09:35:01 sos Exp $
     $FreeBSD: src/sys/dev/ata/ata-queue.c,v 1.34 2004/08/27 14:48:32 sos Exp $
     $FreeBSD: src/sys/dev/ata/ata-lowlevel.c,v 1.47 2004/09/03 12:10:44 sos Exp $
     $FreeBSD: src/sys/dev/ata/ata-isa.c,v 1.22 2004/04/30 16:21:34 sos Exp $
     $FreeBSD: src/sys/dev/ata/ata-pci.c,v 1.88 2004/08/20 06:19:25 sos Exp $
     $FreeBSD: src/sys/dev/ata/ata-chipset.c,v 1.88 2004/09/10 10:31:37 sos Exp $
     $FreeBSD: src/sys/dev/ata/ata-dma.c,v 1.131 2004/09/10 10:31:37 sos Exp $
     $FreeBSD: src/sys/dev/ata/ata-disk.c,v 1.177 2004/09/01 12:15:44 sos Exp $
     $FreeBSD: src/sys/dev/ata/atapi-cd.c,v 1.171 2004/08/24 10:39:00 sos Exp $
     $FreeBSD: src/sys/dev/ata/atapi-fd.c,v 1.97 2004/08/05 21:11:33 sos Exp $

>Description:
	Under _combined_ disk and CPU load, the following errors start
	popping up:

ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=53404031
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=54910687
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=56806527
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=61715903
ad6: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=62103999
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=176444927
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=311594591
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=196040671
ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=306623743
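
	For reading the lines above: the status and error values are
	hexadecimal register dumps, and the names in angle brackets are
	the bits that are set (0x51 = 0x40 READY | 0x10 DSC | 0x01 ERROR;
	0x04 = ABORTED). The following is only a minimal sketch of a
	decoder for these FAILURE lines. Bit names other than the four
	that appear in the log above are my own labels based on the ATA
	register layout, not necessarily the strings the driver prints,
	and parse_failure() is a hypothetical helper, not part of any
	FreeBSD tool.

```python
import re

# ATA status register bits, highest first. Only READY, DSC and ERROR are
# confirmed by the log; the other names are assumed labels.
STATUS_BITS = [
    (0x80, "BUSY"), (0x40, "READY"), (0x20, "DEVICE_FAULT"), (0x10, "DSC"),
    (0x08, "DRQ"), (0x04, "CORRECTED"), (0x02, "INDEX"), (0x01, "ERROR"),
]

# ATA error register bits. Only ABORTED is confirmed by the log; the
# other names are assumed labels.
ERROR_BITS = [
    (0x80, "ICRC"), (0x40, "UNCORRECTABLE"), (0x20, "MEDIA_CHANGED"),
    (0x10, "ID_NOT_FOUND"), (0x08, "MEDIA_CHANGE_REQUEST"),
    (0x04, "ABORTED"), (0x02, "NO_MEDIA"), (0x01, "BIT0"),
]

# Matches lines such as:
# ad6: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=53404031
FAILURE_RE = re.compile(
    r"^(\w+): FAILURE - (\w+) "
    r"status=([0-9a-fA-F]+)<[^>]*> error=([0-9a-fA-F]+)<[^>]*> LBA=(\d+)$"
)


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in table if value & mask]


def parse_failure(line):
    """Parse one FAILURE line; return None for any other kind of line."""
    match = FAILURE_RE.match(line.strip())
    if match is None:
        return None
    device, command, status, error, lba = match.groups()
    return {
        "device": device,
        "command": command,
        "status": decode(int(status, 16), STATUS_BITS),
        "error": decode(int(error, 16), ERROR_BITS),
        "lba": int(lba),
    }
```

	Feeding it the first line of the log yields the same three
	status names and the single error name that the driver printed;
	the TIMEOUT line is not a FAILURE line and is skipped.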

	After a while, all disk I/O hangs and even a graceful reboot
	becomes impossible -- the machine hangs after saying:
	"some processes would not die..."

	We have already replaced the disk and the cables twice.

	Under just the disk load, the problem does not appear -- the
	box survives a full run of `iozone -a' without a hitch, for
	example.

	But the errors do appear when, for example, we dump databases
	on it (over NFS) and, at the same time, gzip the dump for
	archiving, or when a big file is being uploaded with scp over
	a fast link with ssh compression. So it looks like something
	inside the ata driver is not being serviced fast enough...

>How-To-Repeat:

	Run `iozone -a' on a disk while gzipping a big file from the
	same drive.
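
	The two loads have to run at the same time, so a small driver
	script may help. This is only a sketch: build_commands() and
	run_concurrently() are hypothetical helpers, the paths in the
	usage comment are placeholders, and it assumes that iozone
	creates its temporary file in the current working directory
	(hence the cwd argument pointing at the affected disk).

```python
import subprocess


def build_commands(big_file):
    """Return the two command lines: the iozone automatic run and a gzip
    of big_file to stdout (the compressed output is discarded later)."""
    return [["iozone", "-a"], ["gzip", "-c", big_file]]


def run_concurrently(commands, cwd=None):
    """Start every command before waiting on any of them, then return
    the exit codes in the same order as the commands."""
    procs = [
        subprocess.Popen(cmd, cwd=cwd, stdout=subprocess.DEVNULL)
        for cmd in commands
    ]
    return [proc.wait() for proc in procs]


# Usage (placeholder paths, both on the affected drive):
#   run_concurrently(build_commands("/mnt/ad6/big.dump"), cwd="/mnt/ad6")
```

	While it runs, watch the console or /var/log/messages for the
	WRITE_DMA failures shown in the description.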

>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:

