gmirror or ata problem

Harald Schmalzbauer harry at schmalzbauer.de
Thu Feb 15 09:35:42 UTC 2007


On Tuesday, 30 January 2007 09:54, Oliver Fromme wrote:
> Hi,
>
> This is strange.  gmirror just detached one of its disks
> for no apparent reason.  I've built a mirror consisting of
> the components ad0 and ad1 (both SATA drives).  It has
> been running fine.  This is RELENG_6 from 2006-12-20.
>
> Yesterday evening ad1 was detached.  There is no other
> error message logged on console or in the logs (i.e. no
> I/O error such as a bad sector or anything).  There was
> no particularly high load at that time.  In fact, the
> machine had been under much higher load before, without
> anything bad happening.
>
> This is from the logs:
>
> Jan 29 19:10:13 pluto -- MARK --
> Jan 29 19:20:26 pluto kernel: ad1: FAILURE - device detached
> Jan 29 19:20:26 pluto kernel: subdisk1: detached
> Jan 29 19:20:26 pluto kernel: ad1: detached
> Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).
> Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.
> Jan 29 19:50:13 pluto -- MARK --
>
> This almost looks like typical Windows problems:  Something
> reports a "failure", but no reason or any other useful
> information.  :-(
>
> "atacontrol list" reports for ad1:
>
>     Master:      no device present
>
> After an atacontrol detach/attach cycle, the device is back
> again:
>
>     Master:  ad1 <SAMSUNG HD160JJ/WU100-41> Serial ATA II
>
> I inserted it back into the gmirror, and right now it's
> synchronizing happily.
>
> Can anybody please explain what happened, and -- more
> importantly -- how to avoid it in the future?  As far as
> I can tell, the disk drives are perfectly OK.

I think this happens when the drive's internal thermal recalibration takes
too long.
Consumer HDDs can be "offline" for quite some time. I don't have numbers
handy, but see Western Digital's explanation for their SATA RE (RAID Edition)
drives. Again, no link handy, sorry.
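
For reference, the recovery Oliver describes (an atacontrol detach/attach
cycle, then putting the disk back into the mirror) could be scripted roughly
as below. This is only a sketch under assumptions: the channel name ata1 is a
guess (check "atacontrol list" for the real one), gm0 and ad1 are taken from
the log above, the "gmirror forget" step may not be needed on every system,
and the run/recover helpers are made up for illustration. By default it only
prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the recovery sequence from the quoted message.
# ASSUMPTIONS: CHANNEL=ata1 is hypothetical; MIRROR and DISK come from the log.
CHANNEL="${CHANNEL:-ata1}"
MIRROR="${MIRROR:-gm0}"
DISK="${DISK:-ad1}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

recover() {
    # Re-probe the ATA channel so the drive shows up again.
    run atacontrol detach "$CHANNEL"
    run atacontrol attach "$CHANNEL"
    # Drop the stale entry for the disconnected component (may be optional),
    # then add the disk back; gmirror resynchronizes it afterwards.
    run gmirror forget "$MIRROR"
    run gmirror insert "$MIRROR" "$DISK"
    # Show the mirror state to follow the synchronization.
    run gmirror status "$MIRROR"
}

recover
```

To actually execute the commands, run it as root with DRY_RUN=0 and the
correct CHANNEL set in the environment.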

-Harry


More information about the freebsd-stable mailing list