ZFS scrub/selfheal not really working

Kip Macy kmacy at freebsd.org
Thu May 28 19:27:22 UTC 2009


As I commented earlier, fletcher2 is not that much better than the TCP
checksum. If you want to use ZFS as a means of salvaging problematic
hardware, crc32 would be more appropriate.
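
To illustrate the kind of weakness meant here, below is a minimal Python sketch. It assumes fletcher2 works roughly as two lanes of running 64-bit sums over 64-bit words; `fletcher2_sketch` and `flip_top_bits` are hypothetical helper names, not ZFS code. It shows one two-bit corruption that this style of checksum misses and that CRC32 (via `zlib.crc32`) catches. It does not quantify the overall error-detection rate of either checksum.

```python
import struct
import zlib

MASK64 = (1 << 64) - 1  # accumulators wrap modulo 2**64


def fletcher2_sketch(data: bytes) -> tuple:
    """Fletcher-style checksum over little-endian 64-bit words, two lanes.

    Assumption: this mirrors the general shape of fletcher2 (running sums
    a0/a1 and sums-of-sums b0/b1); it is not taken from the ZFS source.
    """
    if len(data) % 16 != 0:
        raise ValueError("length must be a multiple of 16 bytes")
    words = struct.unpack("<%dQ" % (len(data) // 8), data)
    a0 = a1 = b0 = b1 = 0
    for i in range(0, len(words), 2):
        a0 = (a0 + words[i]) & MASK64
        a1 = (a1 + words[i + 1]) & MASK64
        b0 = (b0 + a0) & MASK64
        b1 = (b1 + a1) & MASK64
    return (a0, a1, b0, b1)


def flip_top_bits(data: bytes, word_indexes) -> bytes:
    """Flip bit 63 of each listed 64-bit word (little-endian layout)."""
    buf = bytearray(data)
    for w in word_indexes:
        buf[8 * w + 7] ^= 0x80  # most significant byte of word w
    return bytes(buf)


original = bytes(range(64))  # 8 words, 4 loop iterations
# Words 0 and 4 sit in the same lane. Each flip changes a0 by 2**63, so
# together they change it by 2**64, which wraps to 0. Word 0 is counted
# 4 times in b0 and word 4 twice: 6 * 2**63 also wraps to 0.
corrupted = flip_top_bits(original, [0, 4])

print(original != corrupted)                                      # True
print(fletcher2_sketch(original) == fletcher2_sketch(corrupted))  # True
print(zlib.crc32(original) == zlib.crc32(corrupted))              # False
```

The two flipped bits cancel in the modular sums, so the Fletcher-style checksum is unchanged, while CRC32 reports a different value for the corrupted block.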

Cheers,
Kip

On Thu, May 28, 2009 at 6:26 AM, Dmitry Marakasov <amdmi3 at amdmi3.ru> wrote:
> * Andrew Snow (andrew at modulus.org) wrote:
>
>> > I've recently moved my ZFS pool to 6x1TB Hitachi HDDs. However,
>> > those turned out to be quite crappy, and tend to grow unreadable
>> > sectors.  Those sectors are really nasty, because although they are not
>> > readable, they won't be marked as bad and relocated until there's a
>> > write failure. And a write failure actually never happens - if the sector
>> > is rewritten it's perfectly readable again.
>>
>> It seems like it's a good idea to chuck out the whole lot, after first
>> double-checking or replacing your controller, cabling, and power supply.
>
> Yes, that's in the plans. The box also reboots sometimes, losing one of
> the HDDs from the raid (until the next power cycle). I suspect the power supply.
>
> Anyway, it's a nice test for ZFS :)
>
>>   ZFS can't help you :-)
>
> No, actually in the current age of buggy hardware, ZFS is the only thing
> that can help :)
>
>> > So, my question is why doesn't ZFS rewrite those sectors with READ
>> > errors during scrub?
>>
>> Because of the transactional nature of ZFS it writes the fresh data in a
>> different part of the disk and then marks the old bad sectors as free.
>
> Ok, then why do read errors pop up again after a scrub, when they
> should have been recovered?
>
> Actually, I forgot to look into the logs, and they say that ZFS
> shrinks the read block size (down to a sector size sometimes), so
> corrupted sectors likely _are_ used for data, and they don't seem
> to be recovered, although they should be.
>
>> > there's no parity available, will it narrow down read block size to read
>> > the data and not the unused sectors with corruption?
>>
>> Correct.  If no parity is available it will try its best to read as much
>> data as possible and return read errors up to the application layer on
>> sector failure.
>
> Uh huh. That would be less of a worry if not for the thing above.
>
> --
> Dmitry Marakasov   .   55B5 0596 FF1E 8D84 5F56  9510 D35A 80DD F9D2 F77D
> amdmi3 at amdmi3.ru  ..:  jabber: amdmi3 at jabber.ru    http://www.amdmi3.ru
> _______________________________________________
> freebsd-fs at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"
>



-- 
When bad men combine, the good must associate; else they will fall one
by one, an unpitied sacrifice in a contemptible struggle.

    Edmund Burke

