ATA problems again ...
Miroslav Lachman
000.fbsd at quip.cz
Fri Jul 28 13:37:13 UTC 2006
Johan Ström wrote:
[...]
> On 17 jul 2006, at 17.40, Miroslav Lachman wrote:
>
>> Mike Tancsa wrote:
>> [..]
>>
>>> Install the smartmontools from
>>> /usr/ports/sysutils/smartmontools/
>>> and post the output of
>>> smartctl -a /dev/ad8
>>
>>
>> smartmontools was previously installed and running as daemon without
>> any bad reports.
>> I can not run "smartctl -a /dev/ad8" now, because my server housing
>> provider replaced HDD with the new one and after an hour of
>> synchronization "ad8: FAILURE - device detached". So provider
>> replaced whole server, only ad4 is original piece of HW.
>> On the new server synchronization was much faster than on the previous
>> one (1:30 hours compared to 5 hours on the previous server) - so I think
>> it was a HW problem.
>> Now I am running a stress test that copies /usr/ports to another
>> partition in an infinite loop.
>> I will post results later. (On the bad server, the test failed after
>> about 30 minutes. On another server the test has been running fine for
>> a second day, so I think if the disk does not fail after 1 day, the
>> problem is solved.)
>>
>> Finally, I now think this was not GEOM/gmirror related. I tried
>> removing the ad8 provider from gmirror (gm0), booting the system from
>> gm0 with one provider (ad4) and testing ad8 mounted separately - ad8
>> failed again.
>
>
> Just got another one..
>
> Jul 25 13:30:47 elfi kernel: ad4: FAILURE - device detached
> Jul 25 13:30:47 elfi kernel: subdisk4: detached
> Jul 25 13:30:47 elfi kernel: ad4: detached
> Jul 25 13:30:47 elfi kernel: GEOM_MIRROR: Device gm0s1: provider ad4s1 disconnected.
> Jul 25 13:30:47 elfi kernel: g_vfs_done():mirror/gm0s1f[READ(offset=46318008320, length=2048)]error = 6
> Jul 25 13:30:47 elfi kernel: g_vfs_done():mirror/gm0s1f[READ(offset=77269614592, length=16384)]error = 6
>
> 6 days uptime when this occurred... Both disks are tested with PowerMax
> without a single problem (same with smartctl), both SATA cables are
> new. So the only HW problem that I can't rule out would be the mobo, but
> that is quite new too...
>
> Solutions? Try RELENG_6 as recommended earlier?
In my case, replacing the server (mobo) solved the problem. Since then,
I have hit the same problem on a second server. :(
You can try a BIOS update first, then RELENG_6 (I do not think it will
help), and as a last resort, replace the mobo.
Please let me know if a BIOS update solves your problem.
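
For anyone who wants to repeat the kind of stress test quoted above
(copying a large tree to another partition over and over), here is a
minimal sketch. It is not the exact script that was used: the function
name stress_copy, the "copy" subdirectory and the optional pass limit are
assumptions made for illustration only.

```sh
#!/bin/sh
# Hypothetical helper: copy SRC into DST/copy repeatedly.
# Stops at the first failed copy, or after MAX passes.
# MAX=0 (the default) means loop until something fails.
stress_copy() {
    src=$1
    dst=$2
    max=${3:-0}
    pass=0
    while [ "$max" -eq 0 ] || [ "$pass" -lt "$max" ]; do
        pass=$((pass + 1))
        # Remove the previous copy so every pass writes all data again.
        rm -rf "$dst/copy" || return 1
        if ! cp -Rp "$src" "$dst/copy"; then
            echo "pass $pass: copy failed" >&2
            return 1
        fi
        echo "pass $pass ok"
    done
}
```

Called as "stress_copy /usr/ports /mnt/test" (the mount point is only an
example), it runs until a copy fails. If the disk detaches, the failure
should also show up in /var/log/messages next to the "FAILURE - device
detached" lines.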
Miroslav Lachman
More information about the freebsd-stable
mailing list