raidz: error during resilver: what next?
Scott Johnson
scottj75074 at yahoo.com
Fri Sep 10 20:35:18 UTC 2010
Hi all,
I'm running 8.1-RELEASE on amd64. I'm upgrading a 4-disk raidz to faster drives.
While resilvering drive #3, I hit errors on drive #4.
The old drives are ada{1,2,3,4}. New drives are label/hitachi{1,2,3,4}.
While resilvering drive 3, I got timeouts on ada4 that caused `zpool status` to
hang forever. Even `shutdown -r` did not reboot the machine, so a hard reset was
required. dmesg was full of ada4 timeouts, the most recent of which was several
hours old. After the power cycle, ada4 appeared fine and reported no recent
SMART errors.
My current zpool status:
  pool: tank
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME                  STATE     READ WRITE CKSUM
        tank                  DEGRADED     0     0     0
          raidz1              DEGRADED     0     0     0
            label/hitachi1    ONLINE       0     0     0
            label/hitachi2    ONLINE       0     0     0
            replacing         DEGRADED     0     0     1
              ada3            OFFLINE      0   283     0
              label/hitachi3  ONLINE       0     0     0
            ada4              ONLINE       0     0    10
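
For anyone skimming status output like the above, here is a minimal sketch of a filter that prints only the devices with a non-zero READ, WRITE or CKSUM counter. The function name `zpool_error_devices` is hypothetical, and the filter assumes the counters are plain integers (it will skip abbreviated values such as `1.2K`).

```shell
# Print every device line from `zpool status` output (read on stdin)
# whose READ, WRITE or CKSUM counter is non-zero.
# Assumption: counters are plain integers, as in the output above.
zpool_error_devices() {
    awk 'NF == 5 && $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ &&
         ($3 + $4 + $5 > 0) {
             print $1, "read=" $3, "write=" $4, "cksum=" $5
         }'
}
```

Usage would be `zpool status tank | zpool_error_devices`; on the status above it should single out the replacing vdev, ada3 and ada4.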
The resilvering did not complete, and did not automatically continue after the
power cycle.
Where do I go from here? I see two choices:
1. Continue the resilver and see what happens. (How do I restart the resilver?)
2. Cancel the replacement. Pull hitachi3 and put the (still good) ada3 back in.
What commands do I use? My guess:
    zpool offline tank label/hitachi3
    <replace disk>
    zpool online tank ada3
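
A rough sketch of how the two options might map onto zpool subcommands, wrapped in a dry-run helper so nothing touches the pool unless `DRY_RUN=0` is set. The subcommands themselves (`status -v`, `scrub`, `detach`, `online`, `clear`) are in zpool(8); the helper and function names are hypothetical. Two things here are assumptions rather than tested facts: that a scrub will pick the interrupted resilver back up on 8.1, and that `zpool detach` of the new device is the right way to cancel an in-progress replacement on this pool.

```shell
# Dry-run by default: print each command instead of running it.
# Set DRY_RUN=0 to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

POOL=tank                 # pool name from the status output
NEW_DEV=label/hitachi3    # replacement disk, partly resilvered
OLD_DEV=ada3              # original disk, currently OFFLINE

# Option 1: carry on with the replacement.
continue_resilver() {
    run zpool status -v "$POOL"   # -v lists files with known errors
    run zpool scrub "$POOL"       # assumption: restarts the repair pass
}

# Option 2: cancel the replacement and go back to the old disk.
cancel_replacement() {
    run zpool detach "$POOL" "$NEW_DEV"   # assumption: cancels the replace
    run zpool online "$POOL" "$OLD_DEV"
    run zpool clear "$POOL"               # reset the error counters
}
```

Running `cancel_replacement` with the default `DRY_RUN=1` only prints the three commands, which makes it easy to review them before committing to anything.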
Thanks in advance for your help.
[In other news, while ada4 was timed out, booting from my other ZFS pool, zroot,
failed with all kinds of bizarre missing-file errors. It seems that one ZFS pool
being degraded/corrupted has affected the other ZFS pool too. I now regret going
to the trouble of setting up ZFS boot.]
More information about the freebsd-fs
mailing list