Problems replacing failing drive in ZFS pool

Dan Langille dan at langille.org
Tue Jul 20 02:07:32 UTC 2010


On 7/19/2010 12:15 PM, Freddie Cash wrote:
> On Mon, Jul 19, 2010 at 8:56 AM, Garrett Moore <garrettmoore at gmail.com> wrote:
>> So you think it's because when I switch from the old disk to the new disk,
>> ZFS doesn't realize the disk has changed, and thinks the data is just
>> corrupt now? Even if that happens, shouldn't the pool still be available,
>> since it's RAIDZ1 and only one disk has gone away?
>
> I think it's because you pull the old drive, boot with the new drive,
> the controller re-numbers all the devices (ie da3 is now da2, da2 is
> now da1, da1 is now da0, da0 is now da6, etc), and ZFS thinks that all
> the drives have changed, thus corrupting the pool.  I've had this
> happen on our storage servers a couple of times before I started using
> glabel(8) on all our drives (dead drive on RAID controller, remove
> drive, reboot for whatever reason, all device nodes are renumbered,
> everything goes kablooey).

Can you explain a bit about how you use glabel(8) in conjunction with 
ZFS?  If I can retrofit this into an existing ZFS array, it would make 
things easier in the future...
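
For a pool built from scratch, I'm guessing the idea is something like 
this (just a sketch to check my understanding; the disk0..disk4 label 
names are made up, and the devices are the ones in my pool below).  As 
I read glabel(8), "glabel label" writes its metadata to the provider's 
last sector and creates a /dev/label/<name> node, and the pool is then 
built from those names instead of the raw adN devices:

# glabel label disk0 /dev/ad8
# glabel label disk1 /dev/ad10
# glabel label disk2 /dev/ad12
# glabel label disk3 /dev/ad14
# glabel label disk4 /dev/ad16
# glabel status
# zpool create storage raidz1 label/disk0 label/disk1 label/disk2 \
      label/disk3 label/disk4

That way the pool only ever refers to /dev/label/*, so it shouldn't 
matter how the controller renumbers the underlying devices after a 
reboot or a drive swap.

Here's what I'm running now: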

8.0-STABLE #0: Fri Mar  5 00:46:11 EST 2010

# zpool status
   pool: storage
  state: ONLINE
  scrub: none requested
config:

         NAME        STATE     READ WRITE CKSUM
         storage     ONLINE       0     0     0
           raidz1    ONLINE       0     0     0
             ad8     ONLINE       0     0     0
             ad10    ONLINE       0     0     0
             ad12    ONLINE       0     0     0
             ad14    ONLINE       0     0     0
             ad16    ONLINE       0     0     0
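
For the retrofit, my untested guess is that it has to be done one disk 
at a time, letting the pool resilver before moving on.  Using ad16 and 
a made-up label name as an example:

# zpool offline storage ad16
# glabel label disk4 /dev/ad16
# zpool replace storage ad16 label/disk4
# zpool status storage

I don't know whether ZFS will accept that without complaint (glabel 
takes the last sector, so label/disk4 ends up one sector smaller than 
ad16, and the disk still carries the old pool metadata), so I'd try it 
on a scratch box first.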

> Of course, always have good backups.  ;)

In my case, this ZFS array is the backup.  ;)

But I'm setting up a tape library, real soon now....

-- 
Dan Langille - http://langille.org/

