ZFS: "Cannot replace a replacing drive"

Freddie Cash fjwcash at gmail.com
Fri Apr 30 01:40:45 UTC 2010


On Thu, Apr 29, 2010 at 6:06 PM, Wes Morgan <morganw at chemikals.org> wrote:

> On Wed, 28 Apr 2010, Freddie Cash wrote:
>
> > Going through the archives, I see that others have run into this issue,
> > and managed to solve it via "zpool detach".  However, looking closely at
> > the archived messages, all the successful tests had one thing in common:
> > 1 drive ONLINE, 1 drive FAULTED.  If a drive is online, obviously it can
> > be detached.  In all the cases where people have been unsuccessful at
> > fixing this situation, 1 drive is OFFLINE, and 1 drive is FAULTED.  As is
> > our case:
> >
>
> What happened to the drive to fault it?
>
I'm in the process of replacing the 500 GB drives with 1.5 TB drives, to
increase the available storage space in the pool (the process went flawlessly
on the other storage server).  The first 3 disks in the vdev replaced without
issues.
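
For anyone hitting this thread from the archives, that swap cycle looks
roughly like the following.  The pool name "tank" and the /dev/da1 device
node are placeholders; only the label/diskNN naming scheme matches our setup:

    # take the old 500 GB disk out of service
    zpool offline tank label/disk01
    # ...physically swap in the 1.5 TB drive (it shows up as, say, /dev/da1)...
    # re-create the same GEOM label on the new disk
    glabel label disk01 /dev/da1
    # resilver onto the new disk at the same label path
    zpool replace tank label/disk01
    # watch the resilver and let it finish before starting the next disk
    zpool status tank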

The 4th disk turned out to be a dud: nothing but timeouts and read/write
errors during the replace.  So I popped it out, put in a different 1.5 TB
drive, glabel'd it with the same name ... and the pool went "boom".
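
Reconstructing it, the sequence that got us here was something like the
following.  This is not a transcript; "tank" and /dev/da4 are placeholders:

    # original replace, resilvering onto the drive that turned out to be a dud
    zpool replace tank label/disk04
    # ...dud pulled mid-resilver, a second 1.5 TB drive inserted as /dev/da4...
    # the same label name re-created on the new drive
    glabel label disk04 /dev/da4
    # a further replace attempt is refused ("cannot replace a replacing
    # drive", as in the subject line)
    zpool replace tank label/disk04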

Now I'm stuck with a "label/disk04" device that can't be replaced, can't be
offlined, can't be detached.

I've tried exporting and importing the pool, with and without the disk in the
system, and all kinds of variations of detach, online, offline, and replace
against the old device, the new device, and the vdev GUIDs.

Nothing.
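
Concretely, the sorts of invocations tried were along these lines, again
with a placeholder pool name and a made-up GUID:

    zpool detach tank label/disk04
    zpool offline tank label/disk04
    zpool online tank label/disk04
    zpool replace tank label/disk04
    # the same operations against the vdev GUID instead of the label
    zpool detach tank 1234567890123456789
    # export/import, with and without the new disk physically present
    zpool export tank
    zpool import tank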

[Now I know, for the future, to stress-test a drive before putting it into
the pool.]
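
In case it helps anyone else, a burn-in along these lines would likely have
caught the dud.  This is just a sketch, not what we ran, and the dd pass
destroys any data on the disk:

    # SMART long self-test (smartctl comes from sysutils/smartmontools)
    smartctl -t long /dev/da4
    # full write pass, then a full read-back pass
    dd if=/dev/zero of=/dev/da4 bs=1m
    dd if=/dev/da4 of=/dev/null bs=1m
    # look for reallocated or pending sectors afterwards
    smartctl -a /dev/da4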

I'm really hoping there's a way to recover from this, but it doesn't look
like it.  I'll probably have to destroy and recreate the pool next week,
using the 1.5 TB drives from the get-go.
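
The rebuild itself is simple enough, something along these lines (the pool
name, raidz2 layout, and disk count are all guesses here; data gets restored
from backup afterwards):

    zpool destroy tank
    zpool create tank raidz2 label/disk01 label/disk02 label/disk03 \
        label/disk04 label/disk05 label/disk06
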
-- 
Freddie Cash
fjwcash at gmail.com

