g_mirror_access() dropping geom topology_lock [Was: Kernel crash trying to import a ZFS pool with log device]

Andriy Gapon avg at FreeBSD.org
Fri Mar 21 09:39:02 UTC 2014


on 20/03/2014 23:13 Andreas Longwitz said the following:
[snip]
> But if I now run "zpool export" and "zpool import" the kernel crashes:
> ...
> vdev_geom_open_by_path:554[1]: Found provider by name /dev/label/C325BL31.
> vdev_geom_attach:102[1]: Attaching to label/C325BL31.
> g_access(0xffffff0096b23a00(label/C325BL31), 1, 0, 1)
> open delta:[r1w0e1] old:[r0w0e0] provider:[r0w0e0]
> 0xffffff0002c0d400(label/C325BL31)
> g_access(0xffffff0002ba7300(da2), 1, 0, 2)
> open delta:[r1w0e2] old:[r0w0e0] provider:[r0w0e0] 0xffffff0002a23800(da2)
> g_disk_access(da2, 1, 0, 2)
> vdev_geom_attach:123[1]: Created geom and consumer for label/C325BL31.
> vdev_geom_read_config:248[1]: Reading config from label/C325BL31...
> vdev_geom_open_by_path:569[1]: guid match for provider /dev/label/C325BL31.
> vdev_geom_open_by_path:554[1]: Found provider by name /dev/label/C330CJHW.
> vdev_geom_attach:102[1]: Attaching to label/C330CJHW.
> g_access(0xffffff00969a5280(label/C330CJHW), 1, 0, 1)
> open delta:[r1w0e1] old:[r0w0e0] provider:[r0w0e0]
> 0xffffff0002c02100(label/C330CJHW)
> g_access(0xffffff0002b96280(da3), 1, 0, 2)
> open delta:[r1w0e2] old:[r0w0e0] provider:[r0w0e0] 0xffffff0002a23b00(da3)
> g_disk_access(da3, 1, 0, 2)
> vdev_geom_attach:143[1]: Created consumer for label/C330CJHW.
> vdev_geom_read_config:248[1]: Reading config from label/C330CJHW...
> vdev_geom_open_by_path:569[1]: guid match for provider /dev/label/C330CJHW.
> vdev_geom_open_by_path:554[1]: Found provider by name /dev/mirror/gm0p3.
> vdev_geom_attach:102[1]: Attaching to mirror/gm0p3.
> g_access(0xffffff0096b24180(mirror/gm0p3), 1, 0, 1)
> open delta:[r1w0e1] old:[r0w0e0] provider:[r0w0e0]
> 0xffffff00969c7e00(mirror/gm0p3)
> g_part_access(mirror/gm0p3,1,0,1)
> g_access(0xffffff0096c0f800(mirror/gm0), 1, 0, 1)
> open delta:[r1w0e1] old:[r8w8e16] provider:[r8w8e16]
> 0xffffff00969c7d00(mirror/gm0)
> GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e1.

The following part of the log is very informative.
Thank you for capturing it.

> vdev_geom_attach:143[1]: Created consumer for mirror/gm0p3.
> vdev_geom_read_config:248[1]: Reading config from mirror/gm0p3...
> vdev_geom_open_by_path:569[1]: guid match for provider /dev/mirror/gm0p3.

I read the above as thread A entering vdev_geom_open_by_path and trying to
"taste" /dev/mirror/gm0p3.

> g_post_event_x(0xffffffff80b16830, 0xffffff0096b24180, 2, 0)
> vdev_geom_detach:163[1]: Closing access to mirror/gm0p3.
> g_access(0xffffff0096b24180(mirror/gm0p3), -1, 0, -1)
> open delta:[r-1w0e-1] old:[r1w0e1] provider:[r1w0e1]
> 0xffffff00969c7e00(mirror/gm0p3)


Simultaneously, thread B is closing access to /dev/mirror/gm0p3.
It is not clear from the quoted log when and how this thread B got access to the
device in the first place.

> g_part_access(mirror/gm0p3,-1,0,-1)
> g_access(0xffffff0096c0f800(mirror/gm0), -1, 0, -1)
> open delta:[r-1w0e-1] old:[r9w8e17] provider:[r9w8e17]
> 0xffffff00969c7d00(mirror/gm0)
> GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w0e-1.
> vdev_geom_open_by_path:554[1]: Found provider by name /dev/mirror/gm0p3.
> vdev_geom_attach:102[1]: Attaching to mirror/gm0p3.
> vdev_geom_attach:128[1]: Found consumer for mirror/gm0p3.
> g_access(0xffffff0096b24180(mirror/gm0p3), 1, 0, 1)
> open delta:[r1w0e1] old:[r1w0e1] provider:[r1w0e1]
> 0xffffff00969c7e00(mirror/gm0p3)

Thread A is opening access to the device.

> g_part_access(mirror/gm0p3,1,0,1)
> g_access(0xffffff0096c0f800(mirror/gm0), 1, 0, 1)
> Open delta:[r1w0e1] old:[r9w8e17] provider:[r9w8e17]
> 0xffffff00969c7d00(mirror/gm0)
> GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e1.
> vdev_geom_detach:167[1]: Destroyed consumer to mirror/gm0p3.
> g_detach(0xffffff0096b24180)
> g_destroy_consumer(0xffffff0096b24180)

Thread B is destroying a special ZFS "taster" consumer.

> vdev_geom_attach:147[1]: Used existing consumer for mirror/gm0p3.

Thread A is trying to re-use the taster consumer, which has just been destroyed.

> vdev_geom_read_config:248[1]:
> 
> Fatal trap 12: page fault while in kernel mode

Boom!
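
To pull the pieces together, the essential ordering that the log shows is
roughly the following (written as an annotated summary; the thread labels are
mine):

/*
 * 1. Thread A: vdev_geom_attach() finds the existing taster consumer cp
 *    and calls g_access(cp, 1, 0, 1); inside, g_mirror_access() drops the
 *    topology lock.
 * 2. Thread B: vdev_geom_detach(cp, 0) runs to completion, ending in
 *    g_detach(cp) (which clears cp->provider) and g_destroy_consumer(cp).
 * 3. Thread A: returns from vdev_geom_attach() with the stale cp and calls
 *    vdev_geom_read_guid() -> vdev_geom_read_config(), which dereferences
 *    cp->provider == NULL.  Fatal trap 12.
 */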

I see two issues here.
First, the ZFS tasting code could be made more robust.  If it never tried to
re-use the consumer and always created a new one, then most likely this crash
could be avoided.  But there is no bug in the ZFS code as such: it is correct
and it uses the GEOM topology lock to avoid any concurrency issues.
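
A minimal sketch of that "always create a fresh consumer" alternative, assuming
a zfs_gp pointer for the ZFS tasting geom and simplified error handling (this is
a hypothetical illustration, not a patch against vdev_geom.c):

#include <sys/param.h>
#include <geom/geom.h>

/*
 * Hypothetical taster attach that never re-uses a consumer.  The consumer is
 * private to the calling thread, so no other thread holds a reference to it
 * that it could g_detach()/g_destroy_consumer() behind our back.
 */
static struct g_consumer *
taste_attach(struct g_geom *zfs_gp, struct g_provider *pp)
{
    struct g_consumer *cp;
    int error;

    g_topology_assert();        /* caller holds the topology lock */

    cp = g_new_consumer(zfs_gp);
    error = g_attach(cp, pp);
    if (error == 0)
        error = g_access(cp, 1, 0, 1);
    if (error != 0) {
        if (cp->provider != NULL)   /* attached, but opening failed */
            g_detach(cp);
        g_destroy_consumer(cp);
        return (NULL);
    }
    return (cp);
}

The matching teardown would be g_access(cp, -1, 0, -1), g_detach(cp) and
g_destroy_consumer(cp), again under the topology lock.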

Second, the GEOM mirror code breaks a contract on which the ZFS code relies.
g_access() must be called with the topology lock held.
I take that requirement to also mean that the access method of any GEOM
provider must operate under the topology lock and must never drop it.
In other words, if a caller must acquire g_topology_lock before calling
g_access(), then in return it gets a guarantee that the GEOM topology stays
unchanged across the call to g_access().
g_mirror_access() breaks that contract.
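
For reference, the pattern inside g_mirror_access() that breaks the contract is
roughly the following.  This is paraphrased from memory of
sys/geom/mirror/g_mirror.c, so treat the details as approximate; the point is
only the unlock/lock pair around the per-device work:

static int
g_mirror_access(struct g_provider *pp, int acr, int acw, int ace)
{
    struct g_mirror_softc *sc = pp->private;
    int error = 0;

    /* Entered from g_access() with the topology lock held. */
    g_topology_unlock();
    sx_xlock(&sc->sc_lock);

    /* ... check the device state and account for the new access counts ... */

    sx_xunlock(&sc->sc_lock);
    g_topology_lock();

    /*
     * While the topology lock was dropped above, the GEOM topology may have
     * changed under the caller of g_access().
     */
    return (error);
}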

So, the code in vdev_geom_attach() obtains g_topology_lock, then it finds an
existing valid consumer and calls g_access() on it.  It reasonably expects that
the consumer remains valid, but because g_mirror_access() drops and re-acquires
the topology lock, there is a window in which the topology can change and the
consumer can become invalid.
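
Schematically the caller side looks like this (a simplified sketch of the reuse
path, not the literal vdev_geom_attach(); find_existing_taster() is a
hypothetical stand-in for the search over the ZFS geom's consumer list, and the
headers are the same as in the earlier sketch):

/* Hypothetical helper: look up an already attached taster consumer. */
static struct g_consumer *find_existing_taster(struct g_provider *pp);

static struct g_consumer *
taste_reuse(struct g_provider *pp)
{
    struct g_consumer *cp;

    g_topology_assert();    /* held by vdev_geom_open_by_path() */

    cp = find_existing_taster(pp);
    if (cp == NULL)
        return (NULL);

    /*
     * Expectation: the topology lock is held, so cp stays attached across
     * this call.  Reality: for a gmirror provider, g_access() ends up in
     * g_mirror_access(), which drops and re-acquires the topology lock; in
     * that window another thread can g_detach() and g_destroy_consumer()
     * cp, which is exactly what the log above shows.
     */
    if (g_access(cp, 1, 0, 1) != 0)
        return (NULL);

    return (cp);    /* may already point at a destroyed consumer */
}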

I am not very familiar with the gmirror code, so I am not sure how to fix the
problem from that end.

> cpuid = 1; apic id = 01
> fault virtual address   = 0x0
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0xffffffff80b16f01
> stack pointer           = 0x28:0xffffff82452325b0
> frame pointer           = 0x28:0xffffff8245232650
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 15494 (initial thread)
> [thread pid 15494 tid 100151 ]
> Stopped at      vdev_geom_read_config+0x71:     movq    (%rdx),%rsi
> 
> (kgdb) where
> ...
> #9  0xffffffff805dce1b in trap (frame=0xffffff8245232500) at
> /usr/src/sys/amd64/amd64/trap.c:457
> #10 0xffffffff805c3024 in calltrap () at
> /usr/src/sys/amd64/amd64/exception.S:228
> #11 0xffffffff80b16f01 in vdev_geom_read_config (cp=0xffffff0096b24180,
> config=0xffffff8245232670)
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:248
> #12 0xffffffff80b17194 in vdev_geom_read_guid (cp=<value optimized out>)
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:454
> #13 0xffffffff80b172f0 in vdev_geom_open_by_path (vd=0xffffff0002b2f000,
> check_guid=1)
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:559
> #14 0xffffffff80b17528 in vdev_geom_open (vd=0xffffff0002b2f000,
> psize=0xffffff8245232760, max_psize=0xffffff8245232758,
>     ashift=0xffffff8245232750) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:608
> #15 0xffffffff80aca87a in vdev_open (vd=0xffffff0002b2f000)
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:1153
> #16 0xffffffff80acac5e in vdev_reopen (vd=0xffffff0002b2f000)
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:1514
> #17 0xffffffff80ab84e0 in spa_load (spa=0xffffff0002b85000,
> state=SPA_LOAD_TRYIMPORT, type=SPA_IMPORT_EXISTING,
>     mosconfig=<value optimized out>) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1654
> #18 0xffffffff80abaa40 in spa_tryimport (tryconfig=0xffffff00024b2260)
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:4184
> #19 0xffffffff80afb486 in zfs_ioc_pool_tryimport (zc=0xffffff8001f1d000)
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:1630
> #20 0xffffffff80afea7f in zfsdev_ioctl (dev=<value optimized out>,
> zcmd=<value optimized out>, arg=0xffffff00966154c0 "\003",
>     flag=3, td=<value optimized out>) at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:5945
> #21 0xffffffff8037729b in devfs_ioctl_f (fp=0xffffff0002c98960,
> com=3222821382, data=<value optimized out>,
>     cred=<value optimized out>, td=0xffffff0096017000) at
> /usr/src/sys/fs/devfs/devfs_vnops.c:700
> #22 0xffffffff80444b22 in kern_ioctl (td=<value optimized out>,
> fd=<value optimized out>, com=3222821382,
>     data=0xffffff00966154c0 "\003") at file.h:277
> #23 0xffffffff80444d5d in ioctl (td=0xffffff0096017000,
> uap=0xffffff8245232bb0) at /usr/src/sys/kern/sys_generic.c:679
> #24 0xffffffff805dbca4 in amd64_syscall (td=0xffffff0096017000,
> traced=0) at subr_syscall.c:114
> #25 0xffffffff805c331c in Xfast_syscall () at
> /usr/src/sys/amd64/amd64/exception.S:387
> #26 0x0000000180fcec2c in ?? ()
> 
> (kgdb) f 11
> #11 0xffffffff80b16f01 in vdev_geom_read_config (cp=0xffffff0096b24180,
> config=0xffffff8245232670)
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:248
> 248             ZFS_LOG(1, "Reading config from %s...", pp->name);
> (kgdb) list
> 243             int error, l, len;
> 244
> 245             g_topology_assert_not();
> 246
> 247             pp = cp->provider;
> 248             ZFS_LOG(1, "Reading config from %s...", pp->name);
> 249
> 250             psize = pp->mediasize;
> 251             psize = P2ALIGN(psize, (uint64_t)sizeof(vdev_label_t));
> 252
> (kgdb) p *cp
> $1 = {geom = 0xffffff0002c0dd00, consumer = {le_next =
> 0xffffff00969a5280, le_prev = 0xffffff0002c0dd20}, provider = 0x0,
>   consumers = {le_next = 0xffffff009607bb00, le_prev =
> 0xffffff00969c7e20}, acr = 1, acw = 0, ace = 1, spoiled = 0,
>   stat = 0xffffff0002bed5a0, nstart = 17, nend = 17, private = 0x0,
> index = 0}
> (kgdb) p *cp->provider
> Cannot access memory at address 0x0
> 
> (kgdb) f 12
> #12 0xffffffff80b17194 in vdev_geom_read_guid (cp=<value optimized out>)
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:454
> 454             if (vdev_geom_read_config(cp, &config) == 0) {
> (kgdb) p *cp
> $2 = {geom = 0xffffff0002c0dd00, consumer = {le_next =
> 0xffffff00969a5280, le_prev = 0xffffff0002c0dd20}, provider = 0x0,
>   consumers = {le_next = 0xffffff009607bb00, le_prev =
> 0xffffff00969c7e20}, acr = 1, acw = 0, ace = 1, spoiled = 0,
>   stat = 0xffffff0002bed5a0, nstart = 17, nend = 17, private = 0x0,
> index = 0}
> (kgdb) info local
> config = (nvlist_t *) 0xffffff0002b85000
> guid = 0
> (kgdb) list
> 449             uint64_t guid;
> 450
> 451             g_topology_assert_not();
> 452
> 453             guid = 0;
> 454             if (vdev_geom_read_config(cp, &config) == 0) {
> 455                     guid = nvlist_get_guid(config);
> 456                     nvlist_free(config);
> 457             }
> 458             return (guid);
> 
> (kgdb) f 13
> #13 0xffffffff80b172f0 in vdev_geom_open_by_path (vd=0xffffff0002b2f000,
> check_guid=1)
>     at
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c:559
> 559                             guid = vdev_geom_read_guid(cp);
> (kgdb) list
> 554                     ZFS_LOG(1, "Found provider by name %s.",
> vd->vdev_path);
> 555                     cp = vdev_geom_attach(pp);
> 556                     if (cp != NULL && check_guid &&
> ISP2(pp->sectorsize) &&
> 557                         pp->sectorsize <= VDEV_PAD_SIZE) {
> 558                             g_topology_unlock();
> 559                             guid = vdev_geom_read_guid(cp);
> 560                             g_topology_lock();
> 561                             if (guid != vd->vdev_guid) {
> 562                                     vdev_geom_detach(cp, 0);
> 563                                     cp = NULL;
> (kgdb) info local
> pp = (struct g_provider *) 0xffffff00969c7e00
> cp = (struct g_consumer *) 0xffffff0096b24180
> guid = <value optimized out>
> __func__ = "ÿÿ\000\000H\213uÀ\211Ø\211\235üþÿÿ\203À\001\205ÀH\213"
> (kgdb) p *cp
> $3 = {geom = 0xffffff0002c0dd00, consumer = {le_next =
> 0xffffff00969a5280, le_prev = 0xffffff0002c0dd20}, provider = 0x0,
>   consumers = {le_next = 0xffffff009607bb00, le_prev =
> 0xffffff00969c7e20}, acr = 1, acw = 0, ace = 1, spoiled = 0,
>   stat = 0xffffff0002bed5a0, nstart = 17, nend = 17, private = 0x0,
> index = 0}
> (kgdb) p *pp
> $4 = {name = 0xffffff00969c7e88 "mirror/gm0p3", provider = {le_next =
> 0xffffff009698a100, le_prev = 0xffffff0002c03208},
>   geom = 0xffffff0002c68000, consumers = {lh_first =
> 0xffffff009607bb00}, acr = 1, acw = 0, ace = 1, error = 0, orphan = {
>     tqe_next = 0x0, tqe_prev = 0x0}, mediasize = 8589934592, sectorsize
> = 512, stripesize = 0, stripeoffset = 226575360,
>   stat = 0xffffff0002be7120, nstart = 157, nend = 157, flags = 0,
> private = 0xffffff00969c7c00, index = 2}
> (kgdb) quit
> 
> The technical reason for the crash is that in "f 11" we have pp =
> cp->provider = 0.
> I can give more information from the kernel dump, also I can easy repeat
> the crash.
> 


-- 
Andriy Gapon

