Weird problem with gmirror - cannot add the Good disk when previously failed SATA disk is online

Achilleas Mantzios achill at matrix.gatewaynet.com
Mon May 18 10:49:48 UTC 2009


Hey Manoli! Glad to see you again,

On Monday 18 May 2009 13:27:58, Manolis Kiagias wrote:
> Achilleas Mantzios wrote:
> > Hello,
> > Sorry in advance for the cross-posting; it is just that freebsd-geom didn't seem that populated.
> > I run 7.1-PRERELEASE; it's a home server.
> > This morning, after a power failure, the rebuild of my root mirror gm0 failed on disk ad4.
> > The messages were:
> >
> > May 18 08:02:02 panix kernel: ad4: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=268091264
> > May 18 08:02:08 panix kernel: drm0: <Intel i865G GMCH> on vgapci0
> > May 18 08:02:08 panix kernel: info: [drm] AGP at 0xf0000000 128MB
> > May 18 08:02:08 panix kernel: info: [drm] Initialized i915 1.5.0 20060119
> > May 18 08:02:08 panix kernel: drm0: [ITHREAD]
> > May 18 08:02:08 panix kernel: ad4: FAILURE - device detached
> > May 18 08:02:08 panix kernel: subdisk4: detached
> > May 18 08:02:08 panix kernel: ad4: detached
> > May 18 08:02:08 panix kernel: GEOM_MIRROR: Device gm0: provider ad4 disconnected.
> > May 18 08:02:08 panix kernel: GEOM_MIRROR: Device gm0: rebuilding provider ad4 stopped.
> >
> >   
> 
> It looks to me like you have a bad disk now.
> 

I certainly hope so, since there is nothing else I can do.

> > I read http://www.eztiger.org/2008/08/removing-and-re-adding-a-disk-in-gmirror/
> > hoping that the rebuild failure was temporary,
> > and so I tried to just run
> > # gmirror forget gm0
> > # gmirror insert gm0 ad4
> >
> > But the system responded (if I remember correctly):
> > Unknown provider ad4.
> > The system could no longer see ad4 as being online.
> >
> > So I rebooted the system many times and got these results:
> > - With ad4 offline (physically disconnected), the system booted OK.
> > - With both disks online, the system consistently responded with:
> > "GEOM_MIRROR: Cannot add disk ad6 to gm0 (error=22)."
> > which IMO is not OK, since gm0 should add ad6 without a problem,
> > whether ad4 is online or not.
> > - With only ad4 online, it simply cannot find gm0 at all (kind of reasonable).
> >
> > So my only option is to have only ad6 online, with the current gmirror status:
> > panix# gmirror status
> >       Name    Status  Components
> > mirror/gm0  COMPLETE  ad6
> >
> > Does anyone have an idea of how I should proceed (besides buying a UPS unit!)?
> > Is it worthwhile to buy a new disk to replace the current ad4?
> >   
> 
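The forget/insert attempt quoted above fails when the kernel no longer exposes ad4. A minimal dry-run sketch of that sequence, assuming the provider shows up as a node under /dev (the function name reinsert_plan is made up for illustration). It only prints the gmirror commands rather than running them:

```shell
#!/bin/sh
# Hypothetical helper: print the gmirror commands to re-add a provider,
# but only if its device node is present. Nothing is executed.
reinsert_plan() {
    mirror=$1
    provider=$2
    # Assumption: a usable provider appears as /dev/<name>.
    if [ ! -e "/dev/$provider" ]; then
        echo "provider $provider is not present; reattach or replace the disk first"
        return 1
    fi
    # Drop disconnected components from the mirror, then re-add the disk.
    echo "gmirror forget $mirror"
    echo "gmirror insert $mirror $provider"
}

reinsert_plan gm0 ad4 || true
```

If the check fails, as it did here after ad4 was detached, there is nothing for gmirror to insert until the disk is visible again.
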
> I'd recommend attaching the bad disk on its own to a system and performing
> tests on it. Is the BIOS recognizing it properly? I would run hardware

Yes, the BIOS recognizes it OK, I suppose.

> tests on it - either the manufacturer's own, or something like
> sysutils/smartmontools. You could also try installing FreeBSD on it and
> see if it works. And probably use dd to wipe all the contents, especially
> the partition table and the last sector, where the geom information is stored.
> 

Thanks. Lacking time, I think I will try a brand new identical disk.
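
For reference, the dd suggestion above (wiping the partition table and the last sector) can be sketched as a dry run. This is only an illustration under stated assumptions: the media size and sector size are passed in by hand (the figures below are an example for a 500 GB disk with 512-byte sectors, not values from this system), the function names last_sector and wipe_plan are made up, and the commands are printed, not executed:

```shell
#!/bin/sh
# Dry run only: prints the dd commands, does not touch any disk.

# Index of the last sector, counting from 0.
# $1 = media size in bytes, $2 = sector size in bytes
last_sector() {
    echo $(( $1 / $2 - 1 ))
}

# $1 = device node, $2 = media size in bytes, $3 = sector size in bytes
wipe_plan() {
    dev=$1
    size=$2
    ss=$3
    last=$(last_sector "$size" "$ss")
    # First sector: where the MBR partition table lives.
    echo "dd if=/dev/zero of=$dev bs=$ss count=1"
    # Last sector: where gmirror keeps its metadata.
    echo "dd if=/dev/zero of=$dev bs=$ss seek=$last count=1"
}

# Example values (assumed, not measured on this machine).
wipe_plan /dev/ad4 500107862016 512
```

The real sizes would have to come from the disk itself before anything is run for real. To clear only the mirror metadata, gmirror clear ad4 does that without dd.
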

> > Why is the presence of the supposedly bad disk ad4 affecting gm0,
> > when I have already told gm0 to forget about ad4?
> >   
> 
> The bad disk may be sending confusing signals to the bus / IDE
> interface. I've had this once (although it was due to a bad cable). The
> entire mirror would disappear suddenly.
> 
> 



-- 
Achilleas Mantzios

