ATA kablooie

Matthew D. Fuller fullermd at over-yonder.net
Mon Mar 12 10:17:08 UTC 2007


I have a box that until Friday night was running a Nov '05 -CURRENT
solidly.  After an upgrade to a new -CURRENT build, it started spewing out

kernel: ad4: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=38617823

style warnings at the slightest provocation.  A "find / -xdev -print |
xargs cat >> /dev/null" could bring it about in a second or two; not
uncommonly, the arduous effort of spawning off 'sh' for single-user
mode was enough to put it over the cliff.

The system runs an ataraid RAID-1 across ad4 and ad6; which one got the
first errors was pretty much luck of the draw on any given boot.  They're
on a Promise TX2200 card:

atapci0: <Promise PDC20571 SATA150 controller> port 0xc000-0xc07f,0xc400-0xc4ff mem 0xeb420000-0xeb420fff,0xeb400000-0xeb41ffff irq 15 at device 13.0 on pci0
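For anyone wanting to check the "luck of the draw" observation against
their own logs, here is a minimal sketch that tallies ICRC warnings per
device and reports which device logged one first.  The regex is modelled
only on the single warning line quoted above, and the function name
tally_icrc is made up for illustration; other ATA messages may well be
formatted differently.

```python
import re
from collections import Counter

# Assumption: every warning of interest looks like the one line quoted
# in this mail.  The pattern is not derived from the ata(4) driver source.
ICRC_RE = re.compile(
    r"(?P<dev>ad\d+): WARNING - (?P<op>\w+) UDMA ICRC error "
    r"\(retrying request\) LBA=(?P<lba>\d+)"
)


def tally_icrc(lines):
    """Return (per-device warning counts, first device that logged one).

    `lines` is any iterable of log lines; lines that do not match the
    pattern are skipped.  The second element is None if nothing matched.
    """
    counts = Counter()
    first = None
    for line in lines:
        match = ICRC_RE.search(line)
        if match is None:
            continue
        dev = match.group("dev")
        counts[dev] += 1
        if first is None:
            first = dev
    return counts, first
```

Something like tally_icrc(open("/var/log/messages")) after each boot
would show whether ad4 or ad6 tripped first, assuming the kernel
messages end up in that file on the machine in question.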

The card/drives were tried in 3 very different motherboards, all of
which failed identically.  BIOSen were scoured for "make PCI edgy"
options, which were all turned off (though none exhibited an "enable
bus master" option, which is where one seemingly-related mail thread
ended up).  I tried using the loader variable to force the drives to
PIO mode to jam the brakes on, but it didn't seem to work at all (maybe
it doesn't affect SATA?).  I tried splitting the RAID so it only dealt
with one drive; that made no difference.

The -CURRENT build was from identical sources to those currently
sitting on this machine, so I can supply $Id$'s if it'll help.  Sadly,
the system needed to be running, so it's not available for further
experimentation.  It ran flawlessly with that Nov '05 -CURRENT, and is
now running flawlessly on RELENG_6.


-- 
Matthew Fuller     (MF4839)   |  fullermd at over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

