Another case of the vanishing disk

Sun Mar 16 07:00:52 UTC 2014

Seek_Error_Rate, Hardware_ECC_Recovered, Raw_Read_Error_Rate are all
increasing steadily for all four disks. Does this have something to do
with the recent resilver of the disk or the ongoing scrub (16.5%
completed)?
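
To put numbers on "increasing steadily", the before/after comparison
suggested in the quoted mail below could be done roughly like this. It is
only a sketch: it assumes each snapshot is the saved text of
`smartctl -A` for one disk, with one attribute per line (not wrapped the
way the mail client wrapped the rows quoted below). The function names
are made up for illustration.

```python
def parse_smart_attributes(text):
    """Map attribute name -> raw value from `smartctl -A` style text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID;
        # header lines and anything else are skipped.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        raw = fields[9]  # RAW_VALUE column
        if raw.isdigit():  # ignore raw values that are not plain integers
            attrs[fields[1]] = int(raw)
    return attrs


def diff_snapshots(before, after):
    """Return {name: after - before} for attributes whose raw value changed."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }
```

Save one snapshot per disk when the test starts and another some hours
later, then feed both through parse_smart_attributes() and
diff_snapshots(). Counters like Power_On_Hours will always move; the
interesting ones are those that keep climbing on all four disks at once.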

On Sun, Mar 16, 2014 at 1:22 AM, Erich Dollansky <erich at alogt.com> wrote:
> Hi,
>
> On Sun, 16 Mar 2014 01:04:05 -0500
> cruxpot <cruxpot at gmail.com> wrote:
>
>> Back in December, it was the power supply. That was a cheap Rosewill
>> 300W PSU. The new one is a Corsair CX500 (500W). The system basically
>> just has an old SCSI card, 4 Green Barracuda 2TB disks, a low-end
>> pci-e video card and a pci-e gigabit NIC. How can the PSU be the
>> problem since I replaced it and it's more than adequate?
>
> The power supply has to regulate the supplied voltages within a given
> range. If this does not work, drives tend to have problems. Your
> problem will be that you do not have the tools to check for this.
>
> The problem is that this is a rare thing. It is just as rare for four
> drives to fail together.
>
> Can you run the machine with another power supply to test? Store the
> SMART values of each disk when you start the test and compare after
> some time.
>
> Erich
>>
>> On Sun, Mar 16, 2014 at 12:43 AM, Erich Dollansky
>> <erichsfreebsdlist at alogt.com> wrote:
>> > Hi,
>> >
>> > On Sun, 16 Mar 2014 00:28:31 -0500
>> > cruxpot <cruxpot at gmail.com> wrote:
>> >
>> >> All four disks have similar smartctl stats as far as those alarms
>> >> go. Are you trying to tell me that all four of my disks are about
>> >> to die? The sudden crashes have already been happening.
>> >
>> > It could also be a problem with the motherboard or power supply. It
>> > is just hard to believe that a motherboard problem would affect the
>> > raw error rate. It is a bit more likely that your power supply is
>> > at its limit and small drops in the 5/12V supply lines cause the
>> > problem.
>> >
>> > Erich
>> >>
>> >> On Sun, Mar 16, 2014 at 12:09 AM, Erich Dollansky
>> >> <erichsfreebsdlist at alogt.com> wrote:
>> >> > Hi,
>> >> >
>> >> > Get a new disk as fast as possible.
>> >> >
>> >> > On Sat, 15 Mar 2014 23:48:58 -0500
>> >> > cruxpot <cruxpot at gmail.com> wrote:
>> >> >
>> >> >> messages:Mar 13 03:03:11 bsdbox kernel: ata4: port is not ready (timeout 15000ms) tfd = 0000ffff
>> >> >
>> >> > First alarm bell is on.
>> >> >
>> >> >> UPDATED  WHEN_FAILED RAW_VALUE
>> >> >>   1 Raw_Read_Error_Rate     0x000f   100   099   006    Pre-fail  Always       -       1476032
>> >> >
>> >> > Second alarm bell.
>> >> >
>> >> >>   7 Seek_Error_Rate         0x000f   078   060   030    Pre-fail  Always       -       64570250
>> >> >
>> >> > Third alarm bell.
>> >> >
>> >> >>   9 Power_On_Hours          0x0032   077   077   000    Old_age   Always       -       20524
>> >> >
>> >> > Warranty should still be on then.
>> >> >
>> >> >> 188 Command_Timeout         0x0032   100   097   000    Old_age   Always       -       50
>> >> >
>> >> > Fourth alarm bell.
>> >> >
>> >> >> 195 Hardware_ECC_Recovered  0x001a   037   004   000    Old_age   Always       -       1476032
>> >> >
>> >> > I think I cannot count that far.
>> >> >
>> >> > A disk with raw errors is not dead yet but it is a clear sign
>> >> > that something is wrong. Be prepared for a sudden crash.
>> >> >
>> >> > Erich
>> >> _______________________________________________
>> >> freebsd-questions at freebsd.org mailing list
>> >> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
>> >> To unsubscribe, send any mail to
>> >> "freebsd-questions-unsubscribe at freebsd.org"
>> >
>