gmirror or ata problem

R. B. Riddick arne_woerner at yahoo.com
Tue Jan 30 09:02:47 UTC 2007


Hi!

--- Oliver Fromme <olli at lurza.secnetix.de> wrote:
> This is strange.  gmirror just detached one of its disks
> for no apparent reason.  I've built a mirror consisting of
> the components ad0 and ad1 (both SATA drives).  It has
> been running fine.  This is RELENG_6 from 2006-12-20.
> 
> Yesterday evening ad1 was detached.  There is no other
> error message logged on console or in the logs (i.e. no
> I/O error such as a bad sector or anything).  There was
> no particularly high load at that time.  In fact, the
> machine had been under much higher load before, without
> anything bad happening.
> 
> This is from the logs:
> 
> Jan 29 19:10:13 pluto -- MARK --
> Jan 29 19:20:26 pluto kernel: ad1: FAILURE - device detached
> Jan 29 19:20:26 pluto kernel: subdisk1: detached
> Jan 29 19:20:26 pluto kernel: ad1: detached
> Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).
> Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
> Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.
> Jan 29 19:50:13 pluto -- MARK --
>
My theory is:
1. Your ad1 disk went to sleep, like others in your time zone...
2. Then gmirror tried to write metadata, which woke up the disk.
3. BUT: the disk was too slow to respond, so ata-disk.c decided to detach the
disk without another try.
4. Then gmirror complained about its inability to write metadata.

Remember: gmirror writes metadata from time to time, because it marks the
mirror clean or dirty depending on the write requests...

Remark: I think etc at fluffles.net reported the same thing some weeks ago...

> This almost looks like typical Windows problems:  Something
> reports a "failure", but no reason or any other useful
> information.  :-(
>
Ouch... That was mean... :-)

> "atacontrol list" reports for ad1:
> 
>     Master:      no device present
>
This looks like the bug that etc at fluffles.net reported...

Increasing some timeout from 5 sec to 15 sec helped on her box...
Maybe this is a job for sos@?
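
To illustrate the theory above, here is a toy model in Python. It is only a
sketch under assumptions, not the real ata driver logic: the function name
simulate_write, the 8-second wake-up time and the retries parameter are made
up for illustration. Only the 5 sec and 15 sec timeouts come from the report
mentioned above.

```python
# Toy model of the timeout theory; NOT how the FreeBSD ata driver works.
# All names and the wake-up time are hypothetical.

def simulate_write(wakeup_seconds, timeout_seconds, retries=0):
    """Return "ok" if the sleeping disk answers in time, else "detached".

    wakeup_seconds  -- assumed time the disk needs to wake up and respond
    timeout_seconds -- how long the driver waits for one request
    retries         -- extra attempts before giving up (the theory says 0)
    """
    for attempt in range(retries + 1):
        # Time the disk still needs when this attempt starts.
        remaining = wakeup_seconds - attempt * timeout_seconds
        if remaining <= timeout_seconds:
            return "ok"
    # Every attempt timed out, so the driver drops the device and
    # gmirror can no longer write its metadata.
    return "detached"


if __name__ == "__main__":
    wakeup = 8  # assumed value, chosen only to show the effect
    for timeout in (5, 15):
        print(timeout, simulate_write(wakeup, timeout))
```

With these made-up numbers, simulate_write(8, 5) gives "detached" while
simulate_write(8, 15) gives "ok". A single retry with the 5 sec timeout
(retries=1) would also be enough in this model.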

> After an atacontrol detach/attach cycle, the device is back
> again:
> 
>     Master:  ad1 <SAMSUNG HD160JJ/WU100-41> Serial ATA II
>
Lucky you! :)

> I inserted it back into the gmirror, and right now it's
> synchronizing happily.
>
:-)

-Arne


 