immense delayed write to file system (ZFS and UFS2), performance issues
Gerrit Kühn
gerrit at pmp.uni-hannover.de
Tue Jan 26 13:57:24 UTC 2010
On Tue, 19 Jan 2010 03:24:49 -0800 Jeremy Chadwick
<freebsd at jdc.parodius.com> wrote about Re: immense delayed write to file
system (ZFS and UFS2), performance issues:
JC> So which drive models above are experiencing a continual increase in
JC> SMART attribute 193 (Load Cycle Count)? My guess is that some of the
JC> WD Caviar Green models, and possibly all of the RE2-GP and RE4-GP
JC> models are experiencing this problem.
Just to add some more info:
I contacted WD support about the problem with the RE4 drives and received a
firmware update by email today which is supposed to fix the problem. I have
not tried it yet, though.
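For anyone who wants to check whether a drive is affected, the raw value of attribute 193 can be pulled out of `smartctl -A` output and compared over time. This is only a minimal sketch: it assumes smartmontools is installed and that the attribute table has the usual ten columns, and the function name is made up for illustration.

```shell
# Print the raw value of SMART attribute 193 (Load Cycle Count).
# Reads `smartctl -A` output on stdin; assumes the standard attribute
# table where column 1 is the attribute ID and column 10 is RAW_VALUE.
load_cycle_count() {
    awk '$1 == 193 { print $10 }'
}
```

It would be used as `smartctl -A /dev/ad8 | load_cycle_count`, run a few hours apart. A value that keeps climbing on a mostly idle disk would point to the head-parking behaviour discussed above.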
I am still busy replacing RE2 disks with updated drives, and I came across a
very strange thing with ZFS. I started with the following pool layout:
mclane# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        tank           ONLINE       0     0     0
          raidz1       ONLINE       0     0     0
            ad8        ONLINE       0     0     0
            ad10       ONLINE       0     0     0
            ad12       ONLINE       0     0     0
        spares
          ad14         AVAIL

errors: No known data errors
All disks still have the firmware bug, so I want to replace them with
disks that I have already fixed. I put in an updated drive as ad18 and
wanted to replace ad12 to get the drive with the broken firmware out:
mclane# zpool replace tank /dev/ad12 /dev/ad18
mclane# zpool status
  pool: tank
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 0.01% done, 52h51m to go
config:

        NAME           STATE     READ WRITE CKSUM
        tank           ONLINE       0     0     0
          raidz1       ONLINE       0     0     0
            ad8        ONLINE       0     0     0  7.21M resilvered
            ad10       ONLINE       0     0     0  7.22M resilvered
            replacing  ONLINE       0     0     0
              ad12     ONLINE       0     0     0
              ad18     ONLINE       0     0     0  10.7M resilvered
        spares
          ad14         AVAIL

errors: No known data errors
However, something must have gone wrong during the resilvering process and
it now looks like this:
mclane# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 2h39m with 0 errors on Tue Jan 26 14:00:00 2010
config:

        NAME           STATE     READ WRITE CKSUM
        tank           DEGRADED     0     0     0
          raidz1       DEGRADED     0     0     0
            ad8        ONLINE       0     0     0  975M resilvered
            ad10       ONLINE       0     0   142  974M resilvered
            replacing  DEGRADED     0 7.25M     0
              ad12     ONLINE       0     0     0
              ad18     REMOVED      0     1     0  79.4M resilvered
        spares
          ad14         AVAIL

errors: No known data errors
What is going on here? ad18 obviously detached during the
process. /var/log/messages just gives me
Jan 26 11:23:33 mclane kernel: ad18: FAILURE - device detached
Additionally, ad10 obviously produced checksum errors. What do I do about the
degraded replacing process? Can I terminate it somehow and maybe replace
ad10 first? Any other hints?
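The entries that stand out in the last status output (non-ONLINE states and non-zero error counters) can also be picked out mechanically, which helps when watching a long resilver. A rough sketch, assuming the column order NAME STATE READ WRITE CKSUM seen in the output above; the function name is hypothetical and this is not part of any ZFS tooling:

```shell
# List vdevs from `zpool status` output (on stdin) that are not ONLINE
# or that show a non-zero READ, WRITE, or CKSUM counter.
# Assumes the column order NAME STATE READ WRITE CKSUM.
zpool_problems() {
    awk '
        NF >= 5 && $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|REMOVED|UNAVAIL)$/ {
            if ($2 != "ONLINE" || $3 != "0" || $4 != "0" || $5 != "0")
                printf "%s %s read=%s write=%s cksum=%s\n", $1, $2, $3, $4, $5
        }
    '
}
```

Run as `zpool status tank | zpool_problems`. For the degraded output above it would list tank, raidz1, ad10, replacing and ad18, and skip ad8 and ad12.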
cu
Gerrit