kern/175179: ZFS may attach wrong device on move

Steven Chamberlain steven at pyro.eu.org
Thu Jan 10 13:20:02 UTC 2013


>Number:         175179
>Category:       kern
>Synopsis:       ZFS may attach wrong device on move
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Jan 10 13:20:01 UTC 2013
>Closed-Date:
>Last-Modified:
>Originator:     Steven Chamberlain
>Release:        9.0-RELEASE-p5
>Organization:
>Environment:
GNU/kFreeBSD mail 9.0-2-amd64 #0 Sat Nov 24 04:44:27 UTC 2012 x86_64 amd64 AMD Athlon(tm) 64 Processor 3700+ GNU/kFreeBSD

(This issue was originally seen on Debian GNU/kFreeBSD systems, http://bugs.debian.org/651624, but I am now reporting it as what I believe to be a FreeBSD kernel ZFS bug.)
>Description:
In the uncommon situation where a ZFS vdev resides on an msdos (MBR) partition, and that partition extends to the end of the drive, a problem can occur when the device is renamed (e.g. /dev/ad0->/dev/ada0, renaming of the SCSI driver, or reattachment of the drive to another bus).

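Since the drive location no longer matches zpool.cache, vdev_geom_attach_by_guid searches all GEOM devices for it.  ZFS keeps copies of the vdev label at the end of the device as well as at the start, so when the partition runs to the end of the drive, its trailing labels can also be read from the end of the whole disk.  The matching GUID may therefore be found first at the end of /dev/ada0, and the whole-disk device is attached instead of, for example, /dev/ada0s4.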
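
The overlap can be illustrated with a small sketch.  It assumes the usual ZFS layout of four 256 KiB labels (two at the start of the vdev, two at the end) and assumes the usable size is rounded down to a multiple of the label size; the function name and the disk geometry are made up for illustration and are not taken from the kernel source.

```python
LABEL_SIZE = 256 * 1024   # size of one ZFS vdev label (assumed layout)
LABELS = 4                # two leading labels, two trailing labels


def label_offsets(start, size):
    """Absolute byte offsets of the four labels of a provider that
    covers [start, start + size) on the underlying disk."""
    # Assumption: the usable size is rounded down to a label multiple.
    psize = size - size % LABEL_SIZE
    offsets = []
    for index in range(LABELS):
        rel = index * LABEL_SIZE
        if index >= LABELS // 2:
            # Trailing labels are placed relative to the end of the provider.
            rel += psize - LABELS * LABEL_SIZE
        offsets.append(start + rel)
    return offsets


MiB = 1024 * 1024
disk_size = 1024 * MiB     # hypothetical 1 GiB disk
part_start = 512 * MiB     # hypothetical start of the last partition

whole_disk = label_offsets(0, disk_size)
last_part = label_offsets(part_start, disk_size - part_start)

# Offsets where a label of the partition is also a label of the whole disk.
shared = sorted(set(whole_disk) & set(last_part))
print(shared)

# Leaving 1 MiB unused at the end of the disk removes the overlap.
shrunk_part = label_offsets(part_start, disk_size - part_start - MiB)
print(sorted(set(whole_disk) & set(shrunk_part)))
```

With this geometry the two trailing labels of the partition sit at exactly the offsets where a whole-disk vdev would keep its own trailing labels, which is why a GUID search over /dev/ada0 can succeed.  The leading labels do not coincide, consistent with the whole-disk device later being marked UNAVAIL.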

The pool is started, but because the start of /dev/ada0 is in the wrong place, the device is marked UNAVAIL.  If enough devices in the pool were renamed (especially in the case of FreeBSD 8->9 renaming, /dev/ad?->/dev/ada?) then a root-on-ZFS system will be unbootable as shown:

> ada0:  Previously was known as ad4
> [...]
> Trying to mount from zfs:tank/root [rw]...
> vdev_geom_open_by_guid:352[1]: Searching by guid [$number]
> vdev_geom_read_guid:239[1]: Reading guid from ada0
> vdev_geom_read_guid:273[1]: guid for ada0 is $number
> vdev_geom_attach:95[1]: Attaching to ada0.
> vdev_geom_attach:116[1]: Created geom consumer for ada0
> vdev_geom_open_by_guid:363[1]: Attach by guid [$number] succeeded, provider /dev/ada0
> vdev_geom_detach:156[1]: Closing access to ada0
> vdev_geom_detach:160Mounting from zfs:tank/root failed with error 6 Destroyed consumer to ada0
>How-To-Repeat:
I find it easiest to reproduce with qemu:

Create a VM with -hda disk1.img -hdb disk2.img

Partition each disk with /dev/ada?s1 (unused) and /dev/ada?s4, which extends to the end of the disk.  Create a mirrored root zpool on /dev/ada0s4 and /dev/ada1s4 and set it up as the root filesystem.

Reboot and verify this is working.

Now shut down, and 'move' the drives to the second IDE bus when restarting the VM:  -hda /dev/null -hdb /dev/null -hdc disk1.img -hdd disk2.img

Booting then fails with:  Mounting from zfs:tank/root failed with error 6


Alternate method:  create the root zpool on IDE disks with FreeBSD 8, then try to upgrade to a FreeBSD 9 kernel.  Device renaming triggers the same issue.
>Fix:
GEOM partitions should ideally be checked before the whole disk.


The problem is easily avoided, though, by leaving a little space at the end of the drive.

For each drive in the zpool, I detached it, adjusted the ZFS partition's start and end sectors to leave some space at the end of the disk, zeroed that space, reattached it to the pool, resilvered, and updated zpool.cache.  On re-testing as above, the /dev/ada?s4 partitions are properly detected after being moved.
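
As a rough aid for choosing the new end sector, the arithmetic might look like the sketch below.  The 512-byte sector size, the 1 MiB default gap, and the rule that the gap should cover at least the 512 KiB trailing label area of a whole-disk vdev are my assumptions, not something verified against the kernel; the function name is hypothetical.

```python
SECTOR = 512                   # assumed sector size in bytes
LABEL_SIZE = 256 * 1024        # size of one ZFS vdev label (assumed layout)
TRAILING = 2 * LABEL_SIZE      # trailing label area of a whole-disk vdev


def shrunk_end_sector(disk_sectors, gap_bytes=1024 * 1024):
    """Return the last sector (inclusive) the ZFS partition may use so
    that at least gap_bytes remain unused at the end of the disk."""
    if gap_bytes < TRAILING:
        raise ValueError("gap should cover the whole trailing label area")
    gap_sectors = -(-gap_bytes // SECTOR)   # ceiling division
    if gap_sectors >= disk_sectors:
        raise ValueError("gap is larger than the disk")
    return disk_sectors - gap_sectors - 1


# Hypothetical 1 GiB disk: 2097152 sectors of 512 bytes.
print(shrunk_end_sector(2097152))
```

The freed sectors still need to be zeroed, as described above, so that stale copies of the old trailing labels cannot be found at the end of the whole disk.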

Thanks!

>Release-Note:
>Audit-Trail:
>Unformatted:

