ATA problems again ...

Miroslav Lachman 000.fbsd at quip.cz
Fri Jul 28 13:37:13 UTC 2006


Johan Ström wrote:
[...]
> On 17 jul 2006, at 17.40, Miroslav Lachman wrote:
> 
>> Mike Tancsa wrote:
>> [..]
>>
>>> Install the smartmontools from
>>> /usr/ports/sysutils/smartmontools/
>>> and post the output of
>>> smartctl -a /dev/ad8
>>
>>
>> smartmontools was already installed and running as a daemon, without
>> any bad reports.
>> I cannot run "smartctl -a /dev/ad8" now, because my server housing
>> provider replaced the HDD with a new one, and after an hour of
>> synchronization it failed with "ad8: FAILURE - device detached". So the
>> provider replaced the whole server; only ad4 is an original piece of HW.
>> On the new server, synchronization was much faster than on the previous
>> one (1:30 hours compared to 5 hours), so I think it was a HW problem.
>> Now I am running a stress test, copying /usr/ports to another
>> partition in an infinite loop.
>> I will post results later. (On the bad server, the test failed after
>> about 30 minutes. On another server the test has been running fine for
>> a second day, so I think that if the disk does not fail after 1 day,
>> the problem is solved.)
>>
>> Finally, I now think this was not GEOM/gmirror related. I tried
>> removing the ad8 provider from gmirror (gm0), booting the system from
>> gm0 with one provider (ad4), and testing ad8 mounted separately; ad8
>> failed again.
> 
> 
> Just got another one..
> 
> Jul 25 13:30:47 elfi kernel: ad4: FAILURE - device detached
> Jul 25 13:30:47 elfi kernel: subdisk4: detached
> Jul 25 13:30:47 elfi kernel: ad4: detached
> Jul 25 13:30:47 elfi kernel: GEOM_MIRROR: Device gm0s1: provider ad4s1 disconnected.
> Jul 25 13:30:47 elfi kernel: g_vfs_done():mirror/gm0s1f[READ(offset=46318008320, length=2048)]error = 6
> Jul 25 13:30:47 elfi kernel: g_vfs_done():mirror/gm0s1f[READ(offset=77269614592, length=16384)]error = 6
> 
> 6 days of uptime when this occurred... Both disks were tested with PowerMax
> without a single problem (same with smartctl), and both SATA cables are
> new. So the only HW problem that I can't rule out would be the mobo, but
> that is quite new too...
> 
> Solutions? Try RELENG_6 as recommended earlier?

In my case, replacing the server (mobo) solved the problem. In the
meantime, I have run into the same problem on a second server. :(
You can try a BIOS update first, then RELENG_6 (I do not think it will
help), and as a last resort, replace the mobo.

Please let me know if the BIOS update solves your problem.
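
If you want to repeat the stress test quoted above (copying a large tree
to another partition in a loop until something breaks), here is a minimal
sketch. The function name stress_copy, the optional pass limit and the
destination path in the usage line are my own assumptions for
illustration, not an exact copy of what I ran.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC to DST over and over, stopping at the
# first failed copy. Usage: stress_copy SRC DST [MAX_PASSES]
# MAX_PASSES defaults to 0, which means "loop until a copy fails".
stress_copy() {
    src=$1
    dst=$2
    max=${3:-0}
    pass=0
    while [ "$max" -eq 0 ] || [ "$pass" -lt "$max" ]; do
        pass=$((pass + 1))
        # Start every pass from an empty destination.
        rm -rf "$dst" || return 1
        if ! cp -R "$src" "$dst"; then
            echo "pass $pass: copy failed" >&2
            return 1
        fi
        echo "pass $pass: ok"
    done
}
```

For example, "stress_copy /usr/ports /mnt/test/ports" (where /mnt/test is
an assumed mount point on the suspect disk) runs until cp reports an
error. Watch the kernel log for "FAILURE - device detached" at the same
time.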

Miroslav Lachman


More information about the freebsd-stable mailing list