gmirror HD failure detection

Thu Sep 21 10:47:00 PDT 2006

On Thursday 21 September 2006 06:15, Alex Zbyslaw wrote:
> Robin Becker wrote:
> > Dave wrote:
> >> Hi,
> >>    I've got smartd going on a gmirror system, however when smartd
> >> starts up it says it can't find the various drives. I've tried both
> >> the autodetection line as well as specifying the individual drives.
> >> If this does work I'd like to know about it, as I believe I might have
> >> one failing drive, but am not sure which one.
> >> Thanks.
> >> Dave.
> >
> > Well, as root I can certainly run smartctl -a /dev/ad4 (or /dev/ad6), so
> > I assume smartd could.
> >
> > I like the idea of using gmirror status -s, but I don't know what the
> > results would be if one of the disks were going bad. Would it change
> > from COMPLETE to DEGRADED suddenly?
>
> I would expect gmirror to report a problem when a disk had *gone* bad.
> Going bad from a SMART point of view can mean, for example, too high a
> rate of read retries or too many bad sectors remapped.  At that point
> the drive is technically working, so there is nothing technically wrong
> with the array status.  In such a case SMART would just be telling you
> that the disk is likely to go kablooey soon; time for backups, new drive
> etc. etc.
>
> Something like gmirror status -s you can presumably run even every five
> minutes from cron; if you weed out the good results you'll only get
> email if something does go wrong.
>
> Use both approaches, since they tell you different things that only
> sometimes coincide.

If you happen to be one of the smart admins who actually reviews the output of 
the periodic scripts, then simply adding
	daily_status_gmirror_enable="YES"
to /etc/periodic.conf will give you a daily health check. If you want more 
granularity than a single day, you could use the contents of the periodic 
script as a starting point for rolling your own.
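
For the roll-your-own route, a minimal sketch might look like the following. 
It is not the contents of the periodic script. It assumes that 
"gmirror status -s" prints one line per component in the form 
"name status component" (worth checking on your own system) and that gmirror 
is on cron's PATH. The function name filter_unhealthy is made up for 
illustration.

```shell
#!/bin/sh
# Hypothetical helper: pass through only the lines whose second field
# (assumed to be the mirror status) is something other than COMPLETE.
filter_unhealthy() {
    awk '$2 != "COMPLETE"'
}

# Only run the check where gmirror exists. A healthy array produces no
# output, so cron sends mail only when something is not COMPLETE.
if command -v gmirror >/dev/null 2>&1; then
    gmirror status -s | filter_unhealthy
fi
```

Saved under a path of your choosing (the one below is only an example), it 
could be run every five minutes from root's crontab with something like
	*/5 * * * * /usr/local/sbin/gmirror-check.sh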

JN