Is there a "disconnected" state for geom_mirror providers?

Paul Mather paul at gromit.dlib.vt.edu
Sat Apr 23 19:13:34 PDT 2005


Sadly, the "TIMEOUT - WRITE_DMA"-induced disk disconnections have
returned on my -CURRENT system since I upgraded to ATA Mk.III. :-(
However, I've noticed that when a drive is marked as failed and the
device is detached, the provider also disappears from the geom_mirror it
is part of, instead of remaining listed as a "stale", "disconnected" or
"missing" component of the mirror.  Is this the correct behaviour?

In the most recent failure, ad0 was detached:

ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=49981679
ad0: FAILURE - device detached
subdisk0: detached
ad0: detached
GEOM_MIRROR: Cannot update metadata on disk ad0 (error=5).
GEOM_MIRROR: Cannot update metadata on disk ad0 (error=6).
GEOM_MIRROR: Device raid1: provider ad0 disconnected.
GEOM_MIRROR: Request failed (error=6). ad0[WRITE(offset=3847741440, length=16384)]


I performed an "atacontrol detach 0" followed by an "atacontrol attach
0" to "re-discover" the "failed" ad0 as part of the existing
geom_mirror.  This yielded the following:

acd0: detached
(cd0:ata0:0:1:0): lost device
(cd0:ata0:0:1:0): removing device entry
atapicam0: detached
stray irq14
ad0: 24405MB <IBM DJNA-352500 J51OA30K> at ata0-master UDMA33
GEOM_MIRROR: Component ad0 (device raid1) broken, skipping.
GEOM_MIRROR: Cannot add disk ad0 to raid1 (error=22).
acd0: DVDR <LITE-ON DVDRW SOHW-832S/VS08> at ata0-slave UDMA33
cd0 at ata0 bus 0 target 1 lun 0
cd0: <LITE-ON DVDRW SOHW-832S VS08> Removable CD-ROM SCSI-0 device 
cd0: 33.000MB/s transfers
cd0: cd present [1 x 2048 byte records]


The provider ad0 did not show up as a "stale" provider of my "raid1"
mirror (from which it had disappeared when it was detached due to the
"TIMEOUT - WRITE_DMA" failure).  I had to do a "gmirror forget raid1"
before a "gmirror insert raid1 ad0" would allow me to re-insert it so I
could perform a "gmirror rebuild raid1 ad0" to kick off synchronisation.
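
For reference, the recovery sequence described above can be sketched as
a small dry-run script.  The helper name readd_component is hypothetical
and only for illustration; the three gmirror commands are the ones
quoted in this message, with "raid1" and "ad0" being the mirror and
provider names from my system.  The script only prints the commands
rather than running them, so it is safe to try on any machine.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the commands used above
# to bring a detached provider back into a gmirror device.
readd_component() {
    mirror="$1"   # mirror name, e.g. raid1
    disk="$2"     # provider name, e.g. ad0

    # Make the mirror forget components that are no longer connected.
    echo "gmirror forget ${mirror}"
    # Add the re-attached provider back to the mirror.
    echo "gmirror insert ${mirror} ${disk}"
    # Kick off synchronisation of the re-inserted provider,
    # as was done in the sequence described in this message.
    echo "gmirror rebuild ${mirror} ${disk}"
}

readd_component raid1 ad0
```

To actually run the steps, the printed lines would be executed as root
once the disk has been re-attached (in my case via "atacontrol detach 0"
followed by "atacontrol attach 0").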

What is the definition of a "broken" component?  What is the difference
between a "stale" and a "broken" component?

If I were to detach and remove a hot-plug geom_mirror component and
subsequently re-attach it, would the component be considered "stale" or
"broken"?

This is not a major inconvenience (well, the return of the "TIMEOUT -
WRITE_DMA" errors is :), but I was just wondering why my failed
providers now disappear instead of being marked as stale, as happened
in the past.

BTW, my system is a fairly recent -CURRENT: FreeBSD 6.0-CURRENT #0: Mon
Apr 18 12:25:24 EDT 2005.

Cheers,

Paul.
-- 
e-mail: paul at gromit.dlib.vt.edu

"Without music to decorate it, time is just a bunch of boring production
 deadlines or dates by which bills must be paid."
        --- Frank Vincent Zappa

