svn commit: r294329 - in head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys
Nikolai Lifanov
lifanov at mail.lifanov.com
Tue Jan 19 21:25:26 UTC 2016
On 01/19/16 16:02, Nikolai Lifanov wrote:
> On 01/19/16 15:52, Kurt Lidl wrote:
>> On 1/19/16 1:55 PM, Alan Somers wrote:
>>> On Tue, Jan 19, 2016 at 10:00 AM, Alan Somers <asomers at freebsd.org>
>>> wrote:
>>>> Author: asomers
>>>> Date: Tue Jan 19 17:00:25 2016
>>>> New Revision: 294329
>>>> URL: https://svnweb.freebsd.org/changeset/base/294329
>>>>
>>>> Log:
>>>> Disallow zvol-backed ZFS pools
>>>>
>>>> Using zvols as backing devices for ZFS pools is fraught with panics
>>>> and deadlocks. For example, attempting to online a missing device in
>>>> the presence of a zvol can cause a panic when vdev_geom tastes the
>>>> zvol. Better to completely disable vdev_geom from ever opening a
>>>> zvol. The solution relies on setting a thread-local variable during
>>>> vdev_geom_open, and returning EOPNOTSUPP during zvol_open if that
>>>> thread-local variable is set.
>>>>
>>>> Remove the check for MUTEX_HELD(&zfsdev_state_lock) in zvol_open. Its
>>>> intent was to prevent a recursive mutex acquisition panic. However,
>>>> the new check for the thread-local variable also fixes that problem.
>>>>
>>>> Also, fix a panic in vdev_geom_taste_orphan. For an unknown reason,
>>>> this function was set to panic. But it can occur that a device
>>>> disappears during tasting, and it causes no problems to ignore this
>>>> departure.
>>>>
>>>> Reviewed by: delphij
>>>> MFC after: 1 week
>>>> Relnotes: yes
>>>> Sponsored by: Spectra Logic Corp
>>>> Differential Revision: https://reviews.freebsd.org/D4986
>>>>
>>>> Modified:
>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c
>>>>
>>>
>>> Due to popular demand, I will conditionalize this behavior on a
>>> sysctl, and I won't MFC it. The sysctl must default to off (ZFS on
>>> zvols not allowed) because having the ability to put pools on zvols
>>> can cause panics even for users who aren't using it.
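>>>
>>> Roughly, the opt-in would look like this from the command line; the
>>> sysctl name below is only a placeholder, since the final name isn't
>>> decided yet:
>>>
>>> # default 0: vdev_geom refuses to open zvols, so zvol-backed pools
>>> # cannot be created or imported
>>> sysctl vfs.zfs.vol.recursive
>>> # opt in, accepting the known panic/deadlock risk
>>> sysctl vfs.zfs.vol.recursive=1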
>>
>> Thank you!
>>
>>> And let me clear up some confusion:
>>>
>>> 1) Having the ability to put a zpool on a zvol can cause panics and
>>> deadlocks, even if that ability is unused.
>>> 2) Putting a zpool atop a zvol causes unnecessary performance problems
>>> because there are two layers of COW involved, with all their software
>>> complexities. This also applies to putting a zpool atop files on a
>>> ZFS filesystem.
>>> 3) A VM guest putting a zpool on its virtual disk, where the VM host
>>> backs that virtual disk with a zvol, will work fine. That's the ideal
>>> use case for zvols.
>>> 3b) Using ZFS on both host and guest isn't ideal for performance, as
>>> described in item 2. That's why I prefer to use UFS for VM guests.
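>>>
>>> For the host side of item 3, the setup is just a zvol handed to the
>>> guest as its disk, something like this (dataset names made up):
>>>
>>> zfs create -V 20G -o volmode=dev zdata/vm0
>>> bhyve ... -s 3:0,ahci-hd,/dev/zvol/zdata/vm0 ...
>>>
>>> volmode=dev keeps GEOM from tasting whatever the guest writes to the
>>> volume, so the guest's partitions and pools never show up on the host.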
>>
>> The patch as-is very much breaks the way some people do operations
>> on zvols. My script that does virtual machine cloning via snapshots
>> of zvols containing zpools is currently broken because of this. (I
>> upgraded one of my dev hosts right after your commit to verify the
>> broken behavior.)
>>
>> In my script, I boot an auto-install .iso into bhyve:
>>
>> bhyve -c 2 -m ${vmmem} -H -A -I -g 0 \
>>     -s 0:0,hostbridge \
>>     -s 1,lpc -l com1,stdio \
>>     -s 2:0,virtio-net,${template_tap} \
>>     -s 3:0,ahci-hd,"${zvol}" \
>>     -s 4:0,ahci-cd,"${isofile}" \
>>     ${vmname} || \
>>     echo "trapped error exit from bhyve: $?"
>>
>> So, yes, the zpool gets created by the client VM. Then on
>> the hypervisor host, the script imports that zpool and renames it,
>> so that I can have different pool names for all the client VMs.
>> This step now fails:
>>
>> + zpool import -R /virt/base -d /dev/zvol/zdata sys base
>> cannot import 'sys' as 'base': no such pool or dataset
>>         Destroy and re-create the pool from
>>         a backup source.
>>
>> I import the clients' zpools after the pools on them have been
>> renamed, so the hypervisor host can manipulate the files directly.
>> Renaming the zpools only disturbs a small number of disk blocks on
>> each of the snapshots of the zvol.
>>
>> In this way, I can instantiate ~30 virtual machines from
>> a custom install.iso image in less than 3 minutes. And
>> the bulk of that time is doing the installation from the
>> custom install.iso into the first virtual machine. The
>> cloning of the zvols and the manipulation of the resulting
>> filesystems are very fast.
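>>
>> Roughly, the per-VM cloning step looks like this (names invented to
>> match the import command above):
>>
>> zfs snapshot zdata/template@gold
>> zfs clone zdata/template@gold zdata/vm01
>> # import the guest's pool under a new name, touch up files, export
>> zpool import -R /virt/vm01 -d /dev/zvol/zdata sys vm01
>> # ... edit files under /virt/vm01 ...
>> zpool export vm01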
>>
>
> Can't you just set volmode=dev and use zfs clone?
>
Never mind, you want different pool names to manipulate files directly.
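
For reference, what I had in mind was something along these lines
(dataset names made up):

zfs set volmode=dev zdata/template
zfs snapshot zdata/template@gold
zfs clone -o volmode=dev zdata/template@gold zdata/vm01

With volmode=dev the cloned zvol shows up only as a raw device and GEOM
never tastes it, which avoids the host-side problems, but it also means
the host cannot import and rename the guest pool, which is exactly the
step your workflow needs.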
>> -Kurt
>>
>>
>>