This disk failure should not panic a system, but just disconnect the disk from ZFS

Willem Jan Withagen wjw at digiware.nl
Mon Jun 22 15:50:55 UTC 2015


On 22/06/2015 03:49, Michelle Sullivan wrote:
> Quartz wrote:
>> Also:
>>
>>> And thus I would have expected that ZFS would disconnect /dev/da0 and
>>> then switch to the DEGRADED state and continue, letting the operator fix
>>> the broken disk.
>>
>>> Next question to answer is why this WD RED on:
>>
>>> got hung, and nothing about this shows up in SMART....
>>
>> You have a raidz2, which means THREE disks need to go down before the
>> pool is unwritable. The problem is most likely your controller or
>> power supply, not your disks.
>>
> Never make such assumptions...
> 
> I have worked in a professional environment where 9 of 12 disks failed
> within 24 hours of each other....  They were all supposed to be from
> different batches, but due to an error they came from the same batch, and
> the environment was so tightly controlled and the work-load so similar
> that the MTBF was almost identical on all 11 disks in the array... the
> only disk that lasted more than 2 weeks past the failure was the
> hotspare...!
> 

Scary (non)-statistics....
Theories are always nice, but this sort of experience makes your hair go
grey overnight.
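
For anyone hitting the same failure mode: when ZFS does kick the dead disk
out of a raidz2 vdev as one would expect, the recovery path looks roughly
like the sketch below. It assumes a pool named 'tank' (the pool name is not
given in this thread), reuses /dev/da0 from the report above, and 'da8' is
just a stand-in for whatever the replacement device turns out to be:

    # pool should keep running DEGRADED with the bad disk out
    zpool status tank

    # take the suspect disk offline by hand if ZFS has not done so itself
    zpool offline tank da0

    # after swapping the hardware, resilver onto the new disk
    zpool replace tank da0 da8

Once the resilver finishes, zpool status should show the vdev ONLINE again.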

--WjW

