gmirror or ata problem
Oliver Fromme
olli at lurza.secnetix.de
Fri Feb 2 20:20:00 UTC 2007
Fluffles wrote:
> Pawel Jakub Dawidek wrote:
> > Simon L. Nielsen wrote:
> > > Oliver Fromme wrote:
> > > > This is strange. gmirror just detached one of its disks
> > > > for no apparent reason. I've built a mirror consisting of
> > > > the components ad0 and ad1 (both SATA drives). It has
> > > > been running fine. This is RELENG_6 from 2006-12-20.
> > > >
> > > > Yesterday evening ad1 was detached. There is no other
> > > > error message logged on console or in the logs (i.e. no
> > > > I/O error such as a bad sector or anything). There was
> > > > no particularly high load at that time. In fact, the
> > > > machine had been under much higher load before, without
> > > > anything bad happening.
> > > >
> > > > This is from the logs:
> > > >
> > > > Jan 29 19:10:13 pluto -- MARK --
> > > > Jan 29 19:20:26 pluto kernel: ad1: FAILURE - device detached
> > > > Jan 29 19:20:26 pluto kernel: subdisk1: detached
> > > > Jan 29 19:20:26 pluto kernel: ad1: detached
> > > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).
> > > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> > > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> > > > Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.
> > > > Jan 29 19:50:13 pluto -- MARK --
> > > >
> > > I have seen similar problems on my graid3. I think it's simply the
> > > disk which stops responding to commands, or at least ata(4) can't talk
> > > to the disk anymore...
> > >
> > > I see it on:
> > >
> > > ad10: 305245MB <WDC WD3200SD-01KNB0 08.05J08> at ata5-master SATA150
> > > ad12: 305245MB <WDC WD3200SD-01KNB0 08.05J08> at ata6-master SATA150
> > > ad14: 305245MB <WDC WD3200YS-01PGB0 21.00M21> at ata7-master SATA150
> > >
> > > After a reboot everything seems fine again and my RAID is rebuilt.
> > >
> > > I don't know why it happens, but it sucks :-/. I'm running 7-CURRENT
> > > BTW.
> >
> > It seems that when gmirror/graid3 writes to more than one disk at a
> > time, this puts too much load on the ata channel or something and ata
> > disconnects the disk. I don't really know how it works exactly, but
> > maybe some timeout should be increased in the ata code?
>
> My experience is that even a single disk will time out; 5 seconds is
> just not enough for the disk to spin up. Most disks will need 10 seconds
> at least.

In my case it has nothing to do with spin-up or spin-down.
I do not use ataidle, and the disks are running all the
time, so they never have to spin up.

So something else must be causing the problem.

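For anyone who wants to check their own logs for the same pattern, here is
a minimal sketch in Python. It only pulls the detached device and the
GEOM_MIRROR error numbers out of syslog lines like the ones quoted above.
The names summarize, DETACH_RE, GEOM_RE and ERRNO_NAMES are made up for
this example, and the only errno it knows is 6, which is ENXIO in
FreeBSD's <sys/errno.h>.

```python
import re

# Assumption: only the errno seen in this thread is listed.
# 6 is ENXIO ("Device not configured") in FreeBSD's <sys/errno.h>.
ERRNO_NAMES = {6: "ENXIO (Device not configured)"}

# Hypothetical patterns, written to match the log lines quoted in this thread.
DETACH_RE = re.compile(r"kernel: (ad\d+): FAILURE - device detached")
GEOM_RE = re.compile(r"GEOM_MIRROR: .*\(.*error=(\d+)\)")


def summarize(lines):
    """Return (detached devices in order seen, sorted error numbers)."""
    detached = []
    errors = set()
    for line in lines:
        m = DETACH_RE.search(line)
        if m and m.group(1) not in detached:
            detached.append(m.group(1))
        m = GEOM_RE.search(line)
        if m:
            errors.add(int(m.group(1)))
    return detached, sorted(errors)


# Log lines copied from the report quoted above.
SAMPLE = [
    "Jan 29 19:20:26 pluto kernel: ad1: FAILURE - device detached",
    "Jan 29 19:20:26 pluto kernel: subdisk1: detached",
    "Jan 29 19:20:26 pluto kernel: ad1: detached",
    "Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).",
    "Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).",
    "Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.",
]

if __name__ == "__main__":
    devices, errnos = summarize(SAMPLE)
    print("detached:", devices)
    for e in errnos:
        print("error", e, "=", ERRNO_NAMES.get(e, "unknown"))
```

The error=6 in the GEOM_MIRROR lines is consistent with the device having
gone away underneath gmirror, rather than with a media error such as a
bad sector.
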
Best regards
Oliver
--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606, USt-Id: DE204219783
Any opinions expressed in this message are personal to the author and may
not necessarily reflect the opinions of secnetix GmbH & Co KG in any way.
FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd
"C++ is over-complicated nonsense. And Bjorn Shoestrap's book
a danger to public health. I tried reading it once, I was in
recovery for months."
-- Cliff Sarginson