[Bug 221060] zfs: sending/receiving a zvol within same host to the same dataset produces errors and shadow child(!) datasets

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Fri Jul 28 04:25:21 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221060

            Bug ID: 221060
           Summary: zfs: sending/receiving a zvol within same host to the
                    same dataset produces errors and shadow child(!)
                    datasets
           Product: Base System
           Version: 11.0-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs at FreeBSD.org
          Reporter: emz at norma.perm.ru

Sending and receiving a zvol within the same host produces errors and shadow
child datasets (children of a zvol!).

For instance, it's perfectly normal to send a zvol between hosts:

(just an example, not real output)
zfs send -Rv foo/bar@snapshot | ssh -l user remote sudo zfs receive -du tank

It's even possible to send a zvol to another pool and/or dataset:

(from now on real output from FreeBSD 11.1-BETA2)

[root@san1:~]# zfs send -Rv zfsroot/userdata/worker121@candidate | zfs receive -e esx/userdata
full send of zfsroot/userdata/worker121@candidate estimated size is 1,03G
total estimated size is 1,03G
TIME        SENT   SNAPSHOT
06:57:59    523M   zfsroot/userdata/worker121@candidate

But it's not possible to overwrite an existing zvol on any pool using the -d
argument:

[root@san1:~]# zfs create -V 8G esx/userdata/workerX
[root@san1:~]# zfs send -Rv zfsroot/userdata/worker121@candidate | zfs receive -d esx/userdata/workerX
full send of zfsroot/userdata/worker121@candidate estimated size is 1,03G
total estimated size is 1,03G
TIME        SENT   SNAPSHOT
cannot open 'esx/userdata/workerX': operation not applicable to datasets of this type
cannot receive new filesystem stream: unable to restore to destination
warning: cannot send 'zfsroot/userdata/worker121@candidate': signal received


Okay. Maybe it's not technically possible and should not be done either.
But wait... it is possible when using the -e receive argument:


[root@san1:~]# zfs send -Rv zfsroot/userdata/worker121@candidate | zfs receive -e esx/userdata/workerX
full send of zfsroot/userdata/worker121@candidate estimated size is 1,03G
total estimated size is 1,03G
TIME        SENT   SNAPSHOT
07:04:14    559M   zfsroot/userdata/worker121@candidate
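
For reference, the difference between the two flags can be sketched in sh.
This follows the zfs(8) wording (-d discards the first element of the sent
name, -e discards all but the last element). recv_target is a hypothetical
helper, not a zfs command; it only builds the resulting name and never touches
a pool.

```shell
# recv_target MODE TARGET SENT_SNAPSHOT
# Hypothetical helper (not part of zfs): prints the dataset name that
# "zfs receive MODE TARGET" would derive for SENT_SNAPSHOT, per zfs(8).
recv_target() {
    mode=$1
    target=$2
    sent=$3
    case $mode in
        -d) rest=${sent#*/} ;;   # drop the first element (the source pool)
        -e) rest=${sent##*/} ;;  # keep only the last element
        *)  echo "usage: recv_target -d|-e target sent@snap" >&2
            return 2 ;;
    esac
    printf '%s/%s\n' "$target" "$rest"
}
```

For the stream above, -e yields esx/userdata/workerX/worker121@candidate,
while -d would yield esx/userdata/workerX/userdata/worker121@candidate, which
needs an intermediate 'userdata' dataset under the zvol. That may be why -d is
rejected outright, but this is only a guess.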

Okay. Now let's try to remove this newly overwritten zvol:

[root@san1:~]# zfs destroy esx/userdata/workerX
cannot destroy 'esx/userdata/workerX': dataset already exists

Now about the bugs:

1) The error is cryptic and unclear. I found out what it means; I'll show it
below.
2) The receive -e operation in this case produces errors in dmesg:

g_dev_taste: make_dev_p() failed (gp->name=zvol/esx/userdata/workerX/worker121, error=17)
g_dev_taste: make_dev_p() failed (gp->name=zvol/esx/userdata/workerX/worker121@candidate, error=17)
g_dev_taste: make_dev_p() failed (gp->name=zvol/esx/userdata/workerX/worker121s1, error=17)
g_dev_taste: make_dev_p() failed (gp->name=zvol/esx/userdata/workerX/worker121@candidates1, error=17)

3) This operation creates a shadow dataset that is not visible to zfs list -t
all (a child of the zvol, which is weird by itself, isn't it):

[root@san1:~]# zfs list -t all | more
NAME                                   USED  AVAIL  REFER  MOUNTPOINT
esx                                   4,10T  12,5T   500M  /esx
esx/shared                            3,72T  12,5T  3,72T  -

[ ... loads of esx/shared and esx/userdata children omitted so they don't
clutter this PR; trust me, the shadow dataset is not among them ... ]

esx/userdata/workerX                  8,84G  12,5T  19,2K  -

But it's visible in zdb -d <pool>:

[root@san1:~]# zdb -d esx | grep workerX
Dataset esx/userdata/workerX/worker121@candidate [ZVOL], ID 472, cr_txg 76762261, 599M, 2 objects
Dataset esx/userdata/workerX/worker121 [ZVOL], ID 462, cr_txg 76762256, 599M, 2 objects
Dataset esx/userdata/workerX [ZVOL], ID 438, cr_txg 76762220, 19.2K, 2 objects
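
The comparison between the two views can be automated. A minimal sketch,
assuming the output of "zdb -d <pool>" and of "zfs list -H -o name -t all -r
<pool>" were saved to files first; hidden_datasets is a hypothetical helper
name, and the 'mos' entry that zdb may print for the pool's meta object set is
skipped on the assumption that it is never a user dataset:

```shell
# hidden_datasets ZDB_FILE LIST_FILE
#   ZDB_FILE:  saved output of "zdb -d <pool>"
#   LIST_FILE: saved output of "zfs list -H -o name -t all -r <pool>"
# Prints every dataset name that zdb reports but zfs list does not.
hidden_datasets() {
    awk -v list="$2" '
        # Load the names known to zfs list, one per line.
        BEGIN { while ((getline name < list) > 0) seen[name] = 1 }
        # zdb lines look like: Dataset NAME [TYPE], ID n, ...
        $1 == "Dataset" && $2 != "mos" && !($2 in seen) { print $2 }
    ' "$1"
}
```

Something like "zdb -d esx > zdb.txt; zfs list -H -o name -t all -r esx >
list.txt; hidden_datasets zdb.txt list.txt" would then print only the shadow
entries. zdb may list other internal entries, so the result still needs a
human look.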

Okay, this is really what 'zfs destroy' is trying to tell us. Funny thing:
this shadow dataset should be clearable with the destroy -r flag, but it isn't:

[root@san1:~]# zfs destroy -r esx/userdata/workerX
cannot destroy 'esx/userdata/workerX': dataset already exists

Only an explicit destroy kills it (maybe because a child of a zvol is an
artifact that the design does not expect):

[root@san1:~]# zfs destroy -r esx/userdata/workerX/worker121
[root@san1:~]#
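
This workaround could be scripted as a dry run. The sketch below reads "zdb -d"
output on stdin and only prints one explicit destroy command per direct,
non-snapshot child of the given zvol; it runs nothing itself.
child_destroy_cmds is a hypothetical name, and the parsing assumes the
"Dataset NAME [TYPE], ..." line format shown above.

```shell
# child_destroy_cmds PARENT < zdb-output
# Prints "zfs destroy -r NAME" for each direct child of PARENT that zdb
# reports. Snapshots and deeper descendants are skipped, since destroy -r
# on the direct child is expected to cover them.
child_destroy_cmds() {
    awk -v parent="$1" '
        $1 == "Dataset" {
            name = $2
            prefix = parent "/"
            if (index(name, prefix) != 1) next        # not below PARENT
            rest = substr(name, length(prefix) + 1)
            if (index(rest, "/") || index(rest, "@")) next
            print "zfs destroy -r " name
        }
    '
}
```

With the zdb output above, "zdb -d esx | child_destroy_cmds
esx/userdata/workerX" would print the same command that was typed by hand; it
should be reviewed before being run.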

Things may become even more complicated: zdb may give you an Input/output
error on the pool, or even crash (this state can be cleared with
zpool export/import).


Here's the backtrace after several 'zdb -d esx' runs that reported
'Input/output error', followed by the crash:


[root@san1:~]# gdb zdb zdb.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Core was generated by `zdb -d esx/userdata/workerX'.
Program terminated with signal 6, Aborted.
Reading symbols from /lib/libnvpair.so.2...Reading symbols from
/usr/lib/debug//lib/libnvpair.so.2.debug...done.
done.
Loaded symbols for /lib/libnvpair.so.2
Reading symbols from /lib/libumem.so.2...Reading symbols from
/usr/lib/debug//lib/libumem.so.2.debug...done.
done.
Loaded symbols for /lib/libumem.so.2
Reading symbols from /lib/libuutil.so.2...Reading symbols from
/usr/lib/debug//lib/libuutil.so.2.debug...done.
done.
Loaded symbols for /lib/libuutil.so.2
Reading symbols from /lib/libzfs.so.2...Reading symbols from
/usr/lib/debug//lib/libzfs.so.2.debug...done.
done.
Loaded symbols for /lib/libzfs.so.2
Reading symbols from /lib/libzpool.so.2...Reading symbols from
/usr/lib/debug//lib/libzpool.so.2.debug...done.
done.
Loaded symbols for /lib/libzpool.so.2
Reading symbols from /lib/libc.so.7...Reading symbols from
/usr/lib/debug//lib/libc.so.7.debug...done.
done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /lib/libmd.so.6...Reading symbols from
/usr/lib/debug//lib/libmd.so.6.debug...done.
done.
Loaded symbols for /lib/libmd.so.6
Reading symbols from /lib/libutil.so.9...Reading symbols from
/usr/lib/debug//lib/libutil.so.9.debug...done.
done.
Loaded symbols for /lib/libutil.so.9
Reading symbols from /lib/libm.so.5...Reading symbols from
/usr/lib/debug//lib/libm.so.5.debug...done.
done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libavl.so.2...Reading symbols from
/usr/lib/debug//lib/libavl.so.2.debug...done.
done.
Loaded symbols for /lib/libavl.so.2
Reading symbols from /lib/libbsdxml.so.4...Reading symbols from
/usr/lib/debug//lib/libbsdxml.so.4.debug...done.
done.
Loaded symbols for /lib/libbsdxml.so.4
Reading symbols from /lib/libgeom.so.5...Reading symbols from
/usr/lib/debug//lib/libgeom.so.5.debug...done.
done.
Loaded symbols for /lib/libgeom.so.5
Reading symbols from /lib/libz.so.6...Reading symbols from
/usr/lib/debug//lib/libz.so.6.debug...done.
done.
Loaded symbols for /lib/libz.so.6
Reading symbols from /lib/libzfs_core.so.2...Reading symbols from
/usr/lib/debug//lib/libzfs_core.so.2.debug...done.
done.
Loaded symbols for /lib/libzfs_core.so.2
Reading symbols from /lib/libthr.so.3...Reading symbols from
/usr/lib/debug//lib/libthr.so.3.debug...done.
done.
Loaded symbols for /lib/libthr.so.3
Reading symbols from /lib/libsbuf.so.6...Reading symbols from
/usr/lib/debug//lib/libsbuf.so.6.debug...done.
done.
Loaded symbols for /lib/libsbuf.so.6
Reading symbols from /libexec/ld-elf.so.1...Reading symbols from
/usr/lib/debug//libexec/ld-elf.so.1.debug...done.
done.
Loaded symbols for /libexec/ld-elf.so.1
#0  0x000000080158184a in thr_kill () from /lib/libc.so.7
(gdb) bt
#0  0x000000080158184a in thr_kill () from /lib/libc.so.7
#1  0x0000000801581814 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
#2  0x0000000801581789 in abort () at /usr/src/lib/libc/stdlib/abort.c:65
#3  0x00000008011a1471 in ddt_load (spa=<value optimized out>) at assfail.h:75
#4  0x0000000801124188 in spa_load_impl (spa=0x803052c00, pool_guid=<value
optimized out>, config=0x803052ee8, 
    state=<value optimized out>, type=SPA_IMPORT_EXISTING, mosconfig=B_TRUE)
    at
/usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:2844
#5  0x000000080111ca68 in spa_load (spa=<value optimized out>, state=<value
optimized out>, type=SPA_IMPORT_EXISTING, 
    mosconfig=B_TRUE) at
/usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:2230
#6  0x0000000801123b51 in spa_load_impl (spa=0x803052c00, pool_guid=<value
optimized out>, config=0x803052ee8, 
    state=<value optimized out>, type=SPA_IMPORT_EXISTING, mosconfig=B_FALSE)
    at
/usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:2654
#7  0x000000080111ca68 in spa_load (spa=<value optimized out>, state=<value
optimized out>, type=SPA_IMPORT_EXISTING, 
    mosconfig=B_FALSE) at
/usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:2230
#8  0x000000080111c32a in spa_load_best (spa=0x803052c00, state=SPA_LOAD_OPEN,
mosconfig=0, 
    max_request=<value optimized out>, rewind_flags=<value optimized out>)
    at
/usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:3007
#9  0x0000000801118289 in spa_open_common (pool=<value optimized out>,
spapp=<value optimized out>, tag=0x801206642, 
    nvpolicy=<value optimized out>, config=0x0)
    at
/usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:3159
#10 0x0000000801142292 in dsl_pool_hold (name=<value optimized out>,
tag=0x801206642, dp=0x7fffffe5a958)
    at
/usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c:1111
#11 0x000000080116b29f in dmu_objset_own (name=0x7fffffffed87
"esx/userdata/workerX", type=DMU_OST_ANY, 
    readonly=B_TRUE, tag=0x4107dd, osp=0x7fffffe5aa40)
    at
/usr/src/cddl/lib/libzpool/../../../sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:570
#12 0x00000000004062b5 in main (argc=<value optimized out>, argv=<value
optimized out>)
    at
/usr/src/cddl/usr.sbin/zdb/../../../cddl/contrib/opensolaris/cmd/zdb/zdb.c:3792
#13 0x00000000004053df in _start ()
#14 0x0000000800638000 in ?? ()
#15 0x0000000000000000 in ?? ()
(gdb)
