kern/167272: ZFS Disks reordering causes ZFS to pick the wrong drive
David Alves
david.alves at gmx.fr
Tue Apr 24 18:50:10 UTC 2012
>Number: 167272
>Category: kern
>Synopsis: ZFS Disks reordering causes ZFS to pick the wrong drive
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: change-request
>Submitter-Id: current-users
>Arrival-Date: Tue Apr 24 18:50:10 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator: David Alves
>Release: 8.2-RELEASE
>Organization:
>Environment:
FreeBSD xxxxxxxx 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Thu Feb 17 02:41:51 UTC 2011 root at mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
>Description:
Hello,
"zpool status" shows the device label of each disk. If a disk is physically removed and the server is rebooted, the disks are renumbered, and the label of the removed disk ends up assigned to another, valid disk (the slots of the removed disks do not contain any new disks).
ZFS reports it as follows:
  raidz2         DEGRADED      0     0     0
    da16         ONLINE        0     0     0
    da17         ONLINE        0     0     0
    da18         ONLINE        0     0     0
    da19         ONLINE        0     0     0
    da20         ONLINE        0     0     0
    da21         OFFLINE       0     0     0
    da21         ONLINE        0     0     0
    da22         ONLINE        0     0     0
  raidz2         DEGRADED      0     0     0
    da23         ONLINE        0     0     0
    da24         ONLINE        0     0     0
    da25         ONLINE        0     0     0
    da26         ONLINE        0     0     0
    da27         ONLINE        0     0     0
    da27         OFFLINE       0     0     0
    da29         ONLINE        0     0     0
    da30         ONLINE        0     0     0
Notice the da21 and da27 drives.
The old disks da21 and da27 are shown OFFLINE (because they were offlined and removed), but the reordering has assigned those labels to other running drives.
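A duplicated label like this can be spotted mechanically. The following is only a minimal sketch, assuming the "zpool status" text has already been captured into a string; the helper name duplicate_labels, the keyword list and the list of states are illustrative assumptions and are not part of any ZFS tool.

```python
from collections import defaultdict

# Hypothetical helper for illustration; not part of zpool or libzfs.
# Names starting with these prefixes are treated as grouping vdevs, not disks.
VDEV_KEYWORDS = ("raidz", "mirror", "replacing", "spare", "logs", "cache")
# Assumed set of states; lines with any other second field are ignored.
STATES = {"ONLINE", "OFFLINE", "DEGRADED", "FAULTED", "REMOVED", "UNAVAIL"}


def duplicate_labels(status_text):
    """Return {device: [state, ...]} for device names listed more than once."""
    seen = defaultdict(list)
    for line in status_text.splitlines():
        fields = line.split()
        # Expect at least: name, state, read, write, cksum.
        if len(fields) < 5 or fields[1] not in STATES:
            continue
        name, state = fields[0], fields[1]
        if name.startswith(VDEV_KEYWORDS):
            continue
        seen[name].append(state)
    return {name: states for name, states in seen.items() if len(states) > 1}


# Excerpt of the output shown in this report.
SAMPLE = """\
raidz2 DEGRADED 0 0 0
da20 ONLINE 0 0 0
da21 OFFLINE 0 0 0
da21 ONLINE 0 0 0
da22 ONLINE 0 0 0
raidz2 DEGRADED 0 0 0
da26 ONLINE 0 0 0
da27 ONLINE 0 0 0
da27 OFFLINE 0 0 0
da29 ONLINE 0 0 0
"""

print(duplicate_labels(SAMPLE))
# {'da21': ['OFFLINE', 'ONLINE'], 'da27': ['ONLINE', 'OFFLINE']}
```

The states are kept in listing order, so the first entry for each label is the one that appears first in the pool layout.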
The problem appears when performing a "zpool replace": it picks the first matching label in the list when asked to replace a disk.
Example when replacing da21:
It picked the OFFLINE da21 drive to replace because it was the first in the list.
  raidz2         DEGRADED      0     0     0
    da16         ONLINE        0     0     0
    da17         ONLINE        0     0     0
    da18         ONLINE        0     0     0
    da19         ONLINE        0     0     0
    da20         ONLINE        0     0     0
    replacing    DEGRADED      0     0     0
      da21       OFFLINE       0     0     0
      da31       ONLINE        0     0     0  37.1G resilvered
    da21         ONLINE        0     0     0
    da22         ONLINE        0     0     1  512 resilvered
  raidz2         DEGRADED      0     0     0
    da23         ONLINE        0     0     0
    da24         ONLINE        0     0     0
    da25         ONLINE        0     0     0
    da26         ONLINE        0     0     0
    da27         ONLINE        0     0     0
    da27         OFFLINE       0     0     0
    da29         ONLINE        0     0     0
    da30         ONLINE        0     0     0
Example when replacing da27:
It picked the ONLINE da27 drive to replace because it was the first in the list.
  raidz2         ONLINE        0     0     0
    da16         ONLINE        0     0     0
    da17         ONLINE        0     0     0
    da18         ONLINE        0     0     0
    da19         ONLINE        0     0     0
    da20         ONLINE        0     0     0
    da31         ONLINE        0     0     0
    da21         ONLINE        0     0     0
    da22         ONLINE        0     0     1
  raidz2         DEGRADED      0     0     0
    da23         ONLINE        0     0     0
    da24         ONLINE        0     0     0
    da25         ONLINE        0     0     0
    da26         ONLINE        0     0     0
    replacing    ONLINE        0     0     0
      da27       ONLINE        0     0     0
      da28       ONLINE        0     0     0  80.5G resilvered
    da27         OFFLINE       0     0     0
    da29         ONLINE        0     0     0
    da30         ONLINE        0     0     0
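The two results above are consistent with a lookup that stops at the first device whose name matches. The sketch below is only a simplified model of that observed behaviour, not the actual ZFS lookup code; first_match, GROUP_1 and GROUP_2 are hypothetical names, and the lists are excerpts of the two raidz2 groups in listing order.

```python
# Simplified model of the behaviour observed in this report.
# This is NOT the real libzfs code, only an illustration.
def first_match(vdevs, label):
    """Return (index, state) of the first vdev named `label`, or None."""
    for index, (name, state) in enumerate(vdevs):
        if name == label:
            return index, state
    return None


# Excerpts of the two raidz2 groups, in the order zpool status listed them.
GROUP_1 = [("da20", "ONLINE"), ("da21", "OFFLINE"),
           ("da21", "ONLINE"), ("da22", "ONLINE")]
GROUP_2 = [("da26", "ONLINE"), ("da27", "ONLINE"),
           ("da27", "OFFLINE"), ("da29", "ONLINE")]

print(first_match(GROUP_1, "da21"))  # (1, 'OFFLINE')
print(first_match(GROUP_2, "da27"))  # (1, 'ONLINE')
```

With a name-only lookup, which drive gets replaced depends purely on listing order: the removed drive for da21, but a healthy running drive for da27.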
It would be nice to be able to choose exactly which drive in the pool is going to be replaced.
Thank you.
>How-To-Repeat:
To repeat the problem:
1. Offline a drive.
2. Remove the drive.
3. Reboot.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: