ZFS - Unable to offline drive in raidz1 based pool
Kurt Touet
ktouet at gmail.com
Mon Sep 28 17:29:20 UTC 2009
I've run into a similar situation again, with my zfs raidz1 array
reporting itself as healthy when it's not. This, again, came after
some drive spin_retry_count errors (and a power cycle when the box
wouldn't complete a shutdown -h). The pattern goes as follows:
1) A hard drive in the zfs array (for whatever reason) repeatedly
times out; in this case, it generates spin_retry_count errors in the
SMART status.
2) The box is semi-frozen because it cannot deal with activity on the
zfs array, so it won't gracefully complete a shutdown -h now (the
offline/replace sequence I'd want to run at this point is sketched
after this list).
3) The box is power cycled.
4) Everything spins up fine on the box, and the array is now accessible.
5) zpool status - shows the array as online with no degraded status
6) zpool scrub - shows the drives to be desynced and resilvers a couple of them
7) presumably, everything is fine
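
For what it's worth, here is roughly the sequence I'd like to be able to
run at steps 1-2, before the box gets wedged enough to need a power
cycle. I'm using ad14 purely as a stand-in for whichever member is
timing out, and the smartctl check assumes the sysutils/smartmontools
port is installed:

Check the suspect drive's SMART counters:

monolith# smartctl -a /dev/ad14

Take the flaky member offline so the pool stops blocking on it, then
(optionally) swap in the hot spare and confirm the pool state:

monolith# zpool offline storage ad14
monolith# zpool replace storage ad14 ad22
monolith# zpool status storage

In practice, step 2 above is exactly where this falls apart for me: the
offline never gets a chance to run before the box needs the power cycle.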
monolith# zpool status
  pool: storage
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad14    ONLINE       0     0     0
            ad6     ONLINE       0     0     0
            ad12    ONLINE       0     0     0
            ad4     ONLINE       0     0     0
        spares
          ad22      AVAIL

errors: No known data errors
monolith# zpool scrub storage
monolith# zpool status
  pool: storage
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Mon Sep 28 11:17:05 2009
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad14    ONLINE       0     0     0  1.17M resilvered
            ad6     ONLINE       0     0     0  1.50K resilvered
            ad12    ONLINE       0     0     0  2K resilvered
            ad4     ONLINE       0     0     0  2K resilvered
        spares
          ad22      AVAIL

errors: No known data errors
So, my question still stands: how does zfs, upon scrubbing, instantly
know that the drives need to be resilvered (the resilver completes in a
few seconds), when it previously declared the array to be fine with no
known data errors?
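
In case it helps anyone answer, the only way I can think of to compare
the members directly is to dump the vdev labels. My (possibly wrong)
understanding is that each label records the txg of the last config
update, so if one member is lagging behind the others it should show up
there, and the quick resilver would just be catching it up:

monolith# zdb -l /dev/ad14
monolith# zdb -l /dev/ad4
monolith# zpool status -v storage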
Cheers,
-kurt