kern/135050: ZFS clears/hides disk errors on reboot

Thomas Backman serenity at exscape.org
Fri May 29 08:10:05 UTC 2009


>Number:         135050
>Category:       kern
>Synopsis:       ZFS clears/hides disk errors on reboot
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri May 29 08:10:04 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Thomas Backman
>Release:        8.0-CURRENT, r192914
>Organization:
exscape
>Environment:
FreeBSD clone.exscape.org 8.0-CURRENT FreeBSD 8.0-CURRENT #4 r192914: Thu May 28 08:56:46 CEST 2009     root at clone.exscape.org:/usr/obj/usr/src/sys/DTRACE  amd64

>Description:
(Not sure if this is kern or bin, but I'll take a shot.)

When a disk is corrupted, "zpool status" hides the fact that there has ever been any corruption once the system is rebooted. In practice, this could lead to silent corruption (repaired by ZFS for the time being, while the disk is dying) without the user ever finding out. Quite bad. It should tell the user that there have been problems. "zpool history -il" showed nothing of interest either.
>How-To-Repeat:
(... create pool etc ...)
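(For reference, a pool matching the config shown in the status output below could be set up roughly like this -- a hypothetical sketch reconstructed from that output, not the exact commands used; assumes ad1-ad3 are spare disks:)

```shell
# Create a three-way raidz1 pool named "test" on ad1, ad2 and ad3.
# WARNING: destroys any existing data on these disks.
zpool create test raidz1 ad1 ad2 ad3

# Put some data on it so the scrub has something to verify.
dd if=/dev/random of=/test/dummy bs=1m count=100
```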
[root at clone ~]# dd if=/dev/random of=/dev/ad2 bs=1000k count=5 seek=30
[root at clone ~]# zpool scrub test

(... wait for a while ...)

[root at clone ~]# zpool status -v test
  pool: test
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
	attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
	using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 0h2m with 0 errors on Fri May 29 09:52:50 2009
config:

	NAME        STATE     READ WRITE CKSUM
	test        ONLINE       0     0     0
	  raidz1    ONLINE       0     0     0
	    ad1     ONLINE       0     0     0
	    ad2     ONLINE       0     0    79  4.94M repaired
	    ad3     ONLINE       0     0     0

errors: No known data errors

[root at clone ~]# reboot

[root at clone ~]# zpool status -xv
all pools are healthy
[root at clone ~]# zpool status test
  pool: test
 state: ONLINE
 scrub: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	test        ONLINE       0     0     0
	  raidz1    ONLINE       0     0     0
	    ad1     ONLINE       0     0     0
	    ad2     ONLINE       0     0     0
	    ad3     ONLINE       0     0     0

errors: No known data errors
>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted: