5.4-RC2 freezing - ATA related?

Søren Schmidt sos at FreeBSD.ORG
Sat May 21 18:15:13 PDT 2005


On 22/05/2005, at 2:36, Thomas Hurst wrote:

> * Søren Schmidt (sos at FreeBSD.ORG) wrote:
>
>
>> No, my only advice is to use the ATA mkIII patches or, better yet,
>> -current.
>>
>
> In a similar vein, I'm seeing the same WRITE_DMA timeouts and system
> lockups using ATA mkIII patches as I did using the standard RELENG_5
> driver, on two separate systems.
>
> I'm getting the WRITE_DMA retries on a multi-gmirror Athlon system using
> a PCI SATA card; the two PATA drives on the system are fine:
>
>  FreeBSD 5.4-STABLE #0: Thu Apr 28 06:31:53 BST 2005
>  atapci1: <SiI 3112 SATA150 controller> port
>   0xcc00-0xcc0f,0xc800-0xc803,0xc400-0xc407,0xc000-0xc003,0xbc00-0xbc07
>   mem 0xe7062000-0xe70621ff irq 11 at device 12.0 on pci0
>  ad4: 381554MB <ST3400832AS/3.01> [775221/16/63] at ata2-master SATA150
>  ad6: 381554MB <ST3400832AS/3.01> [775221/16/63] at ata3-master SATA150
>  ..
>  ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=401743679
>  ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=781421759
>
> It seems harmless, but results in writes freezing for several seconds
> every couple of hundred MB (annoying with 360G of storage as you might
> imagine).  It normally favours a single drive, but seems to bounce
> between ad4 and ad6 for no apparent reason.  Replacing the SATA card and
> cables has no effect.  Attempting to drop the drives to PIO with
> atacontrol doesn't seem to do anything either (they remain at SATA150).
>
> The other system where I see the lockups (I used to get READ/WRITE_DMA
> timeouts with the lockup many moons ago, which seems to have started
> after a system update, but for the past 6+ months or so I just get the
> lockup) is an old BP6 (dual Celeron), on two different channels on two
> different drives:
>
>  FreeBSD 5.4-STABLE #2: Tue Apr 26 17:59:25 BST 2005
>  atapci1: <HighPoint HPT366 UDMA66 controller> port
>    0xd800-0xd8ff,0xd400-0xd403,0xd000-0xd007 irq 18 at device 19.0 on pci0
>  atapci2: <HighPoint HPT366 UDMA66 controller> port
>    0xe400-0xe4ff,0xe000-0xe003,0xdc00-0xdc07 irq 18 at device 19.1 on pci0
>  ad4: 76319MB <Seagate ST380011A 3.04> at ata2-master UDMA66
>  ad6: 114473MB <Seagate ST3120026A 3.01> at ata3-master UDMA66
>
> Setting these drives to PIO4 resolves the stability problems (which
> again only occur under heavy disk activity, almost always on writes),
> but makes the system crawl.  I'm planning on migrating it to gmirror,
> which I expect will make it behave more like the Athlon, but obviously
> I'd like to be able to use DMA reliably without resorting to RAID-1
> everywhere.
>
> Save me Søren!

You have picked some of the most dreaded HW out there, that's for sure,
so I'm not sure I can do that :)
Anyhow, you should try a recent -current, since some of the race/timeout
problems that are possible in 5.x have been fixed there.

- Søren




