Is there a "disconnected" state for geom_mirror providers?
Paul Mather
paul at gromit.dlib.vt.edu
Sat Apr 23 19:13:34 PDT 2005
Sadly, the "TIMEOUT - WRITE_DMA"-induced disk disconnections have
returned on my -CURRENT system since I upgraded to ATA Mk.III. :-(
However, I've noticed that when a drive is marked as failed and the
device detached, the provider also disappears from the geom_mirror it is
part of, instead of remaining listed as a "stale", "disconnected", or
"missing" component of the mirror. Is this the
correct behaviour?
In the latest failure to occur, ad0 was detached:
ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=49981679
ad0: FAILURE - device detached
subdisk0: detached
ad0: detached
GEOM_MIRROR: Cannot update metadata on disk ad0 (error=5).
GEOM_MIRROR: Cannot update metadata on disk ad0 (error=6).
GEOM_MIRROR: Device raid1: provider ad0 disconnected.
GEOM_MIRROR: Request failed (error=6). ad0[WRITE(offset=3847741440, length=16384)]
I performed an "atacontrol detach 0" followed by an "atacontrol attach
0" to "re-discover" the "failed" ad0 as part of the existing
geom_mirror. This yielded the following:
acd0: detached
(cd0:ata0:0:1:0): lost device
(cd0:ata0:0:1:0): removing device entry
atapicam0: detached
stray irq14
ad0: 24405MB <IBM DJNA-352500 J51OA30K> at ata0-master UDMA33
GEOM_MIRROR: Component ad0 (device raid1) broken, skipping.
GEOM_MIRROR: Cannot add disk ad0 to raid1 (error=22).
acd0: DVDR <LITE-ON DVDRW SOHW-832S/VS08> at ata0-slave UDMA33
cd0 at ata0 bus 0 target 1 lun 0
cd0: <LITE-ON DVDRW SOHW-832S VS08> Removable CD-ROM SCSI-0 device
cd0: 33.000MB/s transfers
cd0: cd present [1 x 2048 byte records]
The provider ad0 did not show up as a "stale" provider of my "raid1"
mirror (from which it had disappeared when it was detached due to the
"TIMEOUT - WRITE_DMA" failure). I had to do a "gmirror forget raid1"
before "gmirror insert raid1 ad0" would let me re-insert it; I could then
run "gmirror rebuild raid1 ad0" to kick off synchronisation.
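For reference, the recovery sequence described above can be sketched as a small shell script. This is only a sketch under assumptions: the "run" and "recover_mirror" helpers and the DRY_RUN variable are hypothetical names invented for illustration, and the final "gmirror status" call is an addition for watching synchronisation progress. The forget, insert, and rebuild commands are the ones quoted above. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of the manual recovery sequence: forget, insert, rebuild.
# DRY_RUN, run() and recover_mirror() are hypothetical helpers, not
# part of gmirror itself.

# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = mirror name (e.g. raid1), $2 = provider to put back (e.g. ad0)
recover_mirror() {
    mirror=$1
    disk=$2
    # Drop the mirror's record of components that are no longer connected.
    run gmirror forget "$mirror" || return 1
    # Add the re-attached provider back to the mirror.
    run gmirror insert "$mirror" "$disk" || return 1
    # Request a rebuild of that component, as was done in this case.
    run gmirror rebuild "$mirror" "$disk" || return 1
    # Show the mirror state so synchronisation progress can be watched.
    run gmirror status "$mirror"
}

recover_mirror raid1 ad0
```

Running it with DRY_RUN=0 as root would execute the commands for real; the mirror and provider names should be checked against the actual system first.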
What is the definition of a "broken" component? What is the difference
between a "stale" and a "broken" component?
If I were to detach and remove a hot-plug geom_mirror component and
subsequently re-attach it, would the component be considered "stale" or
"broken"?
This is not a major inconvenience (well, the return of the "TIMEOUT -
WRITE_DMA" errors is :), but I was just wondering why my failed
providers now disappear instead of being marked as stale, as happened
in the past.
BTW, my system is a fairly recent -CURRENT: FreeBSD 6.0-CURRENT #0: Mon
Apr 18 12:25:24 EDT 2005.
Cheers,
Paul.
--
e-mail: paul at gromit.dlib.vt.edu
"Without music to decorate it, time is just a bunch of boring production
deadlines or dates by which bills must be paid."
--- Frank Vincent Zappa