gmirror Cannot add disk ad5 to gm0 (error=22)
Miroslav Lachman
000.fbsd at quip.cz
Wed Aug 2 22:28:11 UTC 2006
Rick C. Petty wrote:
> On Wed, Aug 02, 2006 at 10:37:49PM +0200, Miroslav Lachman wrote:
>
>>>Did you have SMART enabled in the BIOS?
>>
>>Yes, (as I remember - I have only remote access now) and have
>
>
> Then I doubt the disk itself had any errors.. Likely a bad cable or
> controller, which I've typically seen manifested under heavier disk
> activity.
[...]
> Yup, disks disappear when they stop responding to "bus reset" commands.
> This seems to happen on various controllers after an unpredictable number
> of READ_DMA or WRITE_DMA timeout errors. Theoretically, you could reinit
> the channel and see if the disk pops back up.
Reinit did not help; only a reboot brought the disk back.
> One thing to note: I
> recommend putting the disks on separate channels so if a reinit fails, you
> don't lose both disks. I hate it when manufacturers put two SATA ports on
> the same ATA channel.. Cheap for them, problematic for you.
I don't understand hardware much, but the SATA controller is set to IDE mode
in the BIOS and the disks are on ATA channel 2 as ad4 (master) and ad5 (slave).
If the BIOS setting is changed to AHCI, dmesg shows two more ATA channels: ad4
is ata2-master and the second disk becomes ad8 on ata4-master (without
changing cables / connections). As I see the same problem with the disk
disappearing in both AHCI and IDE mode, I have decided to use IDE mode, which
seems to me a little bit faster in gmirror synchronization.
Is there a big difference between the AHCI and IDE modes of the SATA controller?
As I see in dmesg, the controller is an Intel ICH7 *SATA300* but the disks are
SATA150. Can this cause trouble?
>>>>Can anybody tell me, where is the problem / how can I found what is wrong?
>>>
>>>
>>>What's the output of "gmirror status" ?? I suspect on a reboot, gmirror
>>>will try to synchronize ad4 to ad5 (since ad5 was the first to drop). Once
>>>that is complete, gmirror won't be DEGRADED anymore.
>>
>># gmirror status
>> Name Status Components
>>mirror/gm0 DEGRADED ad4
>
>
> Hmm, and is ad5 detected? (rhetorical question, because I see that it was)
>
>
>>Gmirror is not synchronized after reboot:
>>
>>Aug 1 09:14:50 track kernel: GEOM_MIRROR: Device gm0: provider ad5
>>detected.
>>Aug 1 09:14:50 track kernel: GEOM_MIRROR: Component ad5 (device gm0)
>>broken, skipping.
>
>
> Looks like the disk was marked with bad metadata.
>
>
>>So disk is OK, but gmirror refused to use it?
>
>
> Yes. I would first suggest trying "gmirror deactivate -v gm0 ad5" then
> trying to reactivate it. Maybe that will flush out the wrong metadata.
> If that doesn't work, try booting in verbose mode and attaching the dmesg
> (in particular, when the mirror is being attached).
> Last resort (although not a horrible option), you can try removing ad5 from
> the mirror and relabelling (gmirror label, not bsdlabel) it. If the remove
> fails, use a combination of forget and clear.
gmirror deactivate failed, but forget followed by insert helped:
root at track ~/# gmirror deactivate -v gm0 ad5
No such provider: ad5.
root at track ~/# gmirror forget -v gm0
Done.
root at track ~/# gmirror insert -v gm0 ad5
Done.
root at track ~/# gmirror status
      Name    Status  Components
mirror/gm0  DEGRADED  ad4
                      ad5 (0%)
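To check for this state from a script rather than by eye, something like the following might work. It is only a sketch: it assumes the status output keeps the column order shown above (name, status, components), and the function name degraded_mirrors is my own invention, not part of gmirror.

```shell
# Hypothetical helper: reads "gmirror status" output on stdin and prints
# the name of every mirror whose second column is DEGRADED.
# Assumes the column order Name / Status / Components seen above.
degraded_mirrors() {
    awk '$2 == "DEGRADED" { print $1 }'
}
```

It would be used as "gmirror status | degraded_mirrors"; no output means no mirror is reported as degraded. The header line and the continuation line for the second component are skipped because their second field is not DEGRADED.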
>>If disks are OK, what is wrong? What caused READ / WRITE timeouts?
>>Broken SATA controler? FreeBSD ATA driver?
>
>
> Try replacing the cables, trying a different SATA controller. I've seen
> these timeouts *a lot* and usually my gmirror/gvinum partitions all
> survive (after reboot at least). There are a lot of threads on this and
> other mailing lists describing the timeout problems.
Yes, I have read many posts about similar problems. I have a similar problem on
4 machines, so I think this is not a cable problem. Maybe it is a bad controller
in the whole ASUS RS120 series, or something like that. (4 of 4 identical
machines have similar problems with the disk subsystem.)
Thank you.
Miroslav Lachman