ZFS - Unable to offline drive in raidz1 based pool

Kurt Touet ktouet at gmail.com
Sun Sep 20 22:22:55 UTC 2009


I am using a ZFS pool based on a 4-drive raidz1 setup for storage.  I
believe that one of the drives is failing, and I'd like to
remove/replace it.  The drive has been causing some issues (such as
becoming non-responsive and hanging the system with timeouts), so I'd
like to offline it, and then run in degraded mode until I can grab a
new drive (tomorrow).  However, when I disconnected the drive (pulled
the plug, not using a zpool offline command), the following occurred:

        NAME        STATE     READ WRITE CKSUM
        storage     FAULTED       0     0     1
          raidz1    DEGRADED     0     0     0
            ad14    ONLINE       0     0     0
            ad6     UNAVAIL      0     0     0
            ad12    ONLINE       0     0     0
            ad4     ONLINE       0     0     0

Note: That's my recreation of the output... not the actual text.

At this point, I was unable to do anything with the pool... and all
data was inaccessible.  Fortunately, after leaving the drive unplugged
for a bit, I tried putting it back into the array, and the system
booted properly.  Of course, I still want to replace it, but this is
what happens when I try to take it offline:

monolith# zpool status storage
  pool: storage
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad14    ONLINE       0     0     0
            ad6     ONLINE       0     0     0
            ad12    ONLINE       0     0     0
            ad4     ONLINE       0     0     0

errors: No known data errors
monolith# zpool offline storage ad6
cannot offline ad6: no valid replicas
monolith# uname -a
FreeBSD monolith 8.0-RC1 FreeBSD 8.0-RC1 #2 r197370: Sun Sep 20 15:32:08 CST 2009     k@monolith:/usr/obj/usr/src/sys/MONOLITH  amd64

If the array is online and healthy, why can't I simply offline a drive
and then replace it afterwards?  Any thoughts?   Also, how does a
degraded raidz1 array end up faulting the entire pool?
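
For reference, the sequence I was expecting to work is roughly the
following.  This is only a dry-run sketch: plan_replace is a
hypothetical helper name, it just prints the commands rather than
running them, and it assumes the new drive goes into the same slot and
shows up as ad6 again (hence the one-device form of zpool replace).

```shell
#!/bin/sh
# Dry-run sketch: print the commands for swapping out one pool member.
# Nothing is executed here; review the output before running any of it.
# plan_replace is a hypothetical helper, not part of ZFS.
plan_replace() {
    pool=$1   # pool name, e.g. storage
    dev=$2    # suspect drive, e.g. ad6

    echo "zpool offline $pool $dev"   # stop using the drive; raidz1 should go DEGRADED
    echo "zpool status $pool"         # check the pool state before pulling the drive
    echo "zpool replace $pool $dev"   # after the new drive is installed in the same slot
    echo "zpool status $pool"         # watch the resilver progress
}

plan_replace storage ad6
```

On my system it is the first of those steps that fails with "no valid
replicas".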

Thanks,
-kurt


More information about the freebsd-fs mailing list