zfs pool import hangs on [tx->tx_sync_done_cv]

K. Macy kmacy at freebsd.org
Mon Oct 13 19:14:00 UTC 2014


>>> Yeah, I would have got the zio details, but typically it's "optimised
>>> out" by the compiler, so it will need some effort to track that down,
>>> unfortunately :(
>>>
>>
>> Well, let me know if you can. Re-creating a new 10.x VM is taking a while
>> as it's taking me forever to checkout the sources.
>>
>> Things like that need to somehow continue to be accessible.
>
>
> I believe there's some pool corruption here somewhere, as every once in a
> while I trip an ASSERT panic:
> panic: solaris assert: size >= SPA_MINBLOCKSIZE ||
> range_tree_space(msp->ms_tree) == 0, file:
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c,
> line: 1636
>


<... snip>

You are correct.

(kgdb) p ((zio_t *)$r14)->io_reexecute
$32 = 2 '\002'
(kgdb) p ((zio_t *)$r14)->io_flags
$33 = 0
(kgdb) p ((zio_t *)$r14)->io_spa->spa_suspended
$34 = 1 '\001'

io_reexecute is 2, which is ZIO_REEXECUTE_SUSPEND, and spa_suspended is set.
This means zio_suspend() has been called from zio_done():
	else if (zio->io_reexecute & ZIO_REEXECUTE_SUSPEND) {
		/*
		 * We'd fail again if we reexecuted now, so suspend
		 * until conditions improve (e.g. device comes online).
		 */
		zio_suspend(spa, zio);
	}
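
To make the flag check concrete, here is a minimal standalone sketch of how
the kgdb value decodes. The two ZIO_REEXECUTE_* values are restated from my
reading of sys/zio.h and should be checked against your tree; wants_suspend()
is a hypothetical helper for illustration only and does not exist in ZFS.

```c
#include <stdint.h>

/*
 * Assumed to match sys/zio.h in the OpenSolaris-derived ZFS sources.
 * Restated here so the sketch compiles on its own; verify against your tree.
 */
#define	ZIO_REEXECUTE_NOW	0x01
#define	ZIO_REEXECUTE_SUSPEND	0x02

/*
 * Hypothetical helper (not part of ZFS): returns 1 if the given
 * io_reexecute value has the SUSPEND bit set, i.e. the branch of
 * zio_done() quoted above would call zio_suspend(), and 0 otherwise.
 */
static int
wants_suspend(uint8_t io_reexecute)
{
	return ((io_reexecute & ZIO_REEXECUTE_SUSPEND) != 0);
}
```

With the value from the dump, wants_suspend(2) returns 1, whereas a zio that
only had ZIO_REEXECUTE_NOW set (value 1) would return 0.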

If failure mode were panic we would have panicked when attempting the import:
void
zio_suspend(spa_t *spa, zio_t *zio)
{
	if (spa_get_failmode(spa) == ZIO_FAILURE_MODE_PANIC)
		fm_panic("Pool '%s' has encountered an uncorrectable I/O "
		    "failure and the failure mode property for this pool "
		    "is set to panic.", spa_name(spa));

