svn commit: r294329 - in head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys
Nikolai Lifanov
lifanov at mail.lifanov.com
Tue Jan 19 21:10:38 UTC 2016
On 01/19/16 15:52, Kurt Lidl wrote:
> On 1/19/16 1:55 PM, Alan Somers wrote:
>> On Tue, Jan 19, 2016 at 10:00 AM, Alan Somers <asomers at freebsd.org>
>> wrote:
>>> Author: asomers
>>> Date: Tue Jan 19 17:00:25 2016
>>> New Revision: 294329
>>> URL: https://svnweb.freebsd.org/changeset/base/294329
>>>
>>> Log:
>>> Disallow zvol-backed ZFS pools
>>>
>>> Using zvols as backing devices for ZFS pools is fraught with
>>> panics and
>>> deadlocks. For example, attempting to online a missing device in the
>>> presence of a zvol can cause a panic when vdev_geom tastes the
>>> zvol. Better
>>> to completely disable vdev_geom from ever opening a zvol. The
>>> solution
>>> relies on setting a thread-local variable during vdev_geom_open, and
>>> returning EOPNOTSUPP during zvol_open if that thread-local
>>> variable is set.
>>>
>>> Remove the check for MUTEX_HELD(&zfsdev_state_lock) in zvol_open.
>>> Its intent
>>> was to prevent a recursive mutex acquisition panic. However, the
>>> new check
>>> for the thread-local variable also fixes that problem.
>>>
>>> Also, fix a panic in vdev_geom_taste_orphan. For an unknown
>>> reason, this
>>> function was set to panic. But it can occur that a device
>>> disappears during
>>> tasting, and it causes no problems to ignore this departure.
>>>
>>> Reviewed by: delphij
>>> MFC after: 1 week
>>> Relnotes: yes
>>> Sponsored by: Spectra Logic Corp
>>> Differential Revision: https://reviews.freebsd.org/D4986
>>>
>>> Modified:
>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
>>>
>>> Modified:
>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
>>
>> Due to popular demand, I will conditionalize this behavior on a
>> sysctl, and I won't MFC it. The sysctl must default to off (ZFS on
>> zvols not allowed) because having the ability to put pools on zvols
>> can cause panics even for users who aren't using it.
>
> Thank you!
>
>> And let me clear up some confusion:
>>
>> 1) Having the ability to put a zpool on a zvol can cause panics and
>> deadlocks, even if that ability is unused.
>> 2) Putting a zpool atop a zvol causes unnecessary performance problems
>> because there are two layers of COW involved, with all their software
>> complexities. This also applies to putting a zpool atop files on a
>> ZFS filesystem.
>> 3) A VM guest putting a zpool on its virtual disk, where the VM host
>> backs that virtual disk with a zvol, will work fine. That's the ideal
>> use case for zvols.
>> 3b) Using ZFS on both host and guest isn't ideal for performance, as
>> described in item 2. That's why I prefer to use UFS for VM guests.
>
> The patch as is very much breaks the way some people operate
> on zvols. My script that does virtual machine cloning via snapshots
> of zvols containing zpools is currently broken due to this. (I upgraded
> one of my dev hosts right after your commit, to verify the broken
> behavior.)
>
> In my script, I boot an auto-install .iso into bhyve:
>
> bhyve -c 2 -m ${vmmem} -H -A -I -g 0 \
>     -s 0:0,hostbridge \
>     -s 1,lpc -l com1,stdio \
>     -s 2:0,virtio-net,${template_tap} \
>     -s 3:0,ahci-hd,"${zvol}" \
>     -s 4:0,ahci-cd,"${isofile}" \
>     ${vmname} || \
>     echo "trapped error exit from bhyve: $?"
>
> So, yes, the zpool gets created by the client VM. Then on
> the hypervisor host, the script imports that zpool and renames it,
> so that I can have different pool names for all the client VMs.
> This step now fails:
>
> + zpool import -R /virt/base -d /dev/zvol/zdata sys base
> cannot import 'sys' as 'base': no such pool or dataset
> Destroy and re-create the pool from
> a backup source.
>
> I import the clients' zpools after the zpools on them have
> been renamed, so the hypervisor host can manipulate the
> files directly. Renaming the zpools disturbs only a small
> number of disk blocks on each of the snapshots of the
> zvol.
>
> In this way, I can instantiate ~30 virtual machines from
> a custom install.iso image in less than 3 minutes. And
> the bulk of that time is doing the installation from the
> custom install.iso into the first virtual machine. Cloning
> the zvols and manipulating the resulting filesystems is
> very fast.
>
Can't you just set volmode=dev and use zfs clone?
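Roughly what I have in mind (dataset names like zdata/base and
zdata/vm01 are just examples, not taken from your setup):

```shell
# volmode=dev exposes the zvol only as a raw /dev/zvol node, so GEOM
# never tastes it and the host never tries to open the guest's pool.
zfs set volmode=dev zdata/base

# Snapshot the installed template once, then clone it per VM.  Clones
# are copy-on-write against the snapshot, so each one is nearly instant.
zfs snapshot zdata/base@gold
zfs clone -o volmode=dev zdata/base@gold zdata/vm01
```

That avoids the host-side import entirely, though it obviously doesn't
cover your pool-rename step.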
> -Kurt