Another case of the vanishing disk
cruxpot
cruxpot at gmail.com
Sun Mar 16 08:52:28 UTC 2014
I moved the power cable so it plugs into a surge protector instead of
the UPS. Still have the same problem. Every second the Seek_Error_Rate
raw value goes up; some drives jump by more at a time than others, but
all 4 are doing it.
# smartctl -a /dev/ada2 | egrep 'Error|ECC'
Error logging capability: (0x01) Error logging supported.
1 Raw_Read_Error_Rate 0x000f 110 099 006 Pre-fail Always - 15160
7 Seek_Error_Rate 0x000f 078 060 030 Pre-fail Always - 67260695
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0
195 Hardware_ECC_Recovered 0x001a 037 004 000 Old_age Always - 15160
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
# smartctl -a /dev/ada2 | egrep 'Error|ECC'
Error logging capability: (0x01) Error logging supported.
1 Raw_Read_Error_Rate 0x000f 110 099 006 Pre-fail Always - 15160
7 Seek_Error_Rate 0x000f 078 060 030 Pre-fail Always - 67260696
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0
195 Hardware_ECC_Recovered 0x001a 037 004 000 Old_age Always - 15160
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
# smartctl -a /dev/ada2 | egrep 'Error|ECC'
Error logging capability: (0x01) Error logging supported.
1 Raw_Read_Error_Rate 0x000f 110 099 006 Pre-fail Always - 15160
7 Seek_Error_Rate 0x000f 078 060 030 Pre-fail Always - 67260697
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0
195 Hardware_ECC_Recovered 0x001a 037 004 000 Old_age Always - 15160
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
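
Rather than eyeballing repeated smartctl runs, the same comparison could
be scripted. What follows is only a rough sketch, not something that was
run for this report: it assumes smartctl is installed, that the raw value
is the 10th column of the "smartctl -A" attribute table, and the helper
names smart_raw and smart_delta are made up for illustration.

```sh
#!/bin/sh
# Print the raw value (10th column) of one SMART attribute.
# Reads "smartctl -A" style output on stdin; $1 is the attribute name.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10; exit }'
}

# Sample one attribute twice and print how much the raw value changed.
# $1 = device, $2 = attribute name, $3 = seconds between samples (default 10).
smart_delta() {
    dev=$1
    attr=$2
    secs=${3:-10}
    before=$(smartctl -A "$dev" | smart_raw "$attr")
    sleep "$secs"
    after=$(smartctl -A "$dev" | smart_raw "$attr")
    echo "$dev $attr: $before -> $after (+$((after - before)))"
}

# Example use (the device names are assumptions, adjust as needed):
#   for d in /dev/ada0 /dev/ada1 /dev/ada2 /dev/ada3; do
#       smart_delta "$d" Seek_Error_Rate 10
#   done
```

Running it once per drive, before and after swapping the power supply,
would give directly comparable per-interval increments.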
I will be stunned if it's yet another bad power supply, but I will
have to find another one somewhere and test this again. The drives are
all still under warranty.
On Sun, Mar 16, 2014 at 2:42 AM, cruxpot <cruxpot at gmail.com> wrote:
> It's an active PFC PSU plugged into an UPS which is not. Maybe that is
> the problem. I will try isolating some things tomorrow after the scrub
> has completed to see if I can get the errors to stop incrementing.
>
> On Sun, Mar 16, 2014 at 2:18 AM, Erich Dollansky
> <erichsfreebsdlist at alogt.com> wrote:
>> Hi,
>>
>> On Sun, 16 Mar 2014 02:00:51 -0500
>> cruxpot <cruxpot at gmail.com> wrote:
>>
>>> Seek_Error_Rate, Hardware_ECC_Recovered, Raw_Read_Error_Rate are all
>>> increasing steadily for all four disks. Does this have something to do
>>> with the recent resilver of the disk or the ongoing scrub (16.5%
>>> completed)?
>>>
>> the seek error rate could be linked to a failing power supply. The rest
>> should be just internal to the drive. Of course, also here a failing
>> power supply can be the cause.
>>
>> Can you put the drives into another machine?
>>
>> You must try to isolate the problem. It is a hardware problem on some
>> level. You must find out what it could be.
>>
>> Or just run a single disk on plain UFS. And connect it to some other
>> plug. And disconnect all other drives.
>>
>> Erich