SATA drive 1 disappears

Volker volker at vwsoft.com
Tue Mar 7 10:43:16 UTC 2006


Dear list,

I've seen GEOM mirror error messages on two nearly identical
systems. Both run on Asrock K7VT4xx (VIA chipset) boards and
have two SATA drives connected (Hitachi HDS728080PLA380/PF2OA60A).
On both systems we're using gmirror RAID-1 per slice.

After some weeks of production use, the first disc (ad4) of the
RAID set on both systems started producing error messages like:

> +ad4: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=127199808
> +ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=10968959
> +ad4: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=10968959
> +ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=118404223
> +ad4: FAILURE - WRITE_DMA timed out LBA=10968959
> +GEOM_MIRROR: Request failed (error=5). ad4s1[WRITE(offset=5616074752, length=16384)]
> +GEOM_MIRROR: Device gm0s1: provider ad4s1 disconnected.
> +ad4: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=118404223
> +ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=122117983
> +ad4: FAILURE - WRITE_DMA timed out LBA=118404223
> ...
> +subdisk4: detached
> +ad4: detached
> +GEOM_MIRROR: Device gm0s2: provider ad4s2 disconnected.
> +GEOM_MIRROR: Request failed (error=5). ad4s2[READ(offset=8987662336, length=2048)]
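As a cross-check that the first GEOM_MIRROR failure and the ata timeouts refer to the same request, the byte offset reported by gmirror can be converted to a disc LBA. This is only a sketch: it assumes 512-byte sectors and that ad4s1 starts at sector 63 (the common default, not verified against the actual slice table), and the function name is made up for illustration.

```python
# Assumptions (not taken from the systems themselves):
SECTOR_SIZE = 512   # bytes per sector
SLICE_START = 63    # first sector of ad4s1, the usual default

def offset_to_lba(offset_bytes, slice_start=SLICE_START, sector_size=SECTOR_SIZE):
    """Translate a byte offset inside a slice to an absolute disc LBA."""
    return slice_start + offset_bytes // sector_size

# Values copied from the GEOM_MIRROR line for ad4s1 above.
offset = 5616074752
length = 16384

first = offset_to_lba(offset)
sectors = length // SECTOR_SIZE
print(first, sectors)  # 10968959 32
```

Under these assumptions the result matches the LBA=10968959 of the WRITE_DMA timeout, so the request gmirror gave up on appears to be the same 32-sector write the ata driver was retrying.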

After these messages the disc isn't seen by the system anymore:

> atacontrol list
> ATA channel 0:
>     Master: acd0 <NEC DVD RW ND-3540A/1.01> ATA/ATAPI revision 0
>     Slave:       no device present
> ATA channel 1:
>     Master:      no device present
>     Slave:       no device present
> ATA channel 2:
>     Master:      no device present
>     Slave:       no device present
> ATA channel 3:
>     Master:  ad6 <HDS728080PLA380/PF2OA60A> Serial ATA v1.0
>     Slave:       no device present


The (S)ATA controllers and devices are detected at startup as:
> +atapci0: <VIA 6420 SATA150 controller> port 
> +atapci1: <VIA 8237 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f 
> +ad4: 78533MB <HDS728080PLA380 PF2OA60A> at ata2-master SATA150
> +ad6: 78533MB <HDS728080PLA380 PF2OA60A> at ata3-master SATA150
> +GEOM_MIRROR: Device gm0s1 created (id=613166686).
> +GEOM_MIRROR: Device gm0s1: provider ad4s1 detected.
> +GEOM_MIRROR: Device gm0s2 created (id=91558579).
> +GEOM_MIRROR: Device gm0s2: provider ad4s2 detected.
> +GEOM_MIRROR: Device gm0s1: provider ad6s1 detected.
> +GEOM_MIRROR: Device gm0s1: provider ad6s1 activated.
> +GEOM_MIRROR: Device gm0s1: provider ad4s1 activated.
> +GEOM_MIRROR: Device gm0s1: provider mirror/gm0s1 launched.
> +GEOM_MIRROR: Device gm0s2: provider ad6s2 detected.
> +GEOM_MIRROR: Device gm0s2: provider ad6s2 activated.
> +GEOM_MIRROR: Device gm0s2: provider ad4s2 activated.
> +GEOM_MIRROR: Device gm0s2: provider mirror/gm0s2 launched.

The RAID set is now running degraded. Both systems are running
6.0-RELEASE. I know it's more like guesswork, but what might be the
reason for these disc errors? Are the discs really dying? After a
reboot the first disc reappears for a few days and then disappears
again. The hdd connectors have been checked.

Is there something wrong with gmirror, geom or the controller
driver? What has me scratching my head is that on both systems only
the first disc is dying. I've found postings from a year ago where
the conclusion was faulty hardware. Are there any signs of geom or
driver problems?

`uname -a':
FreeBSD GwOsl 6.0-RELEASE FreeBSD 6.0-RELEASE #0: Wed Nov 30 02:41:47 UTC 2005 root@gwosl:/usr/obj/usr/src/sys/GwOsl  i386

Greetings,

Volker

