ZFS:dmu_objset_find_dp_impl() - panic: vm_fault: fault on nofault entry, addr: fffffe0094653000

Fabian Keil freebsd-listen at fabiankeil.de
Tue Dec 22 15:24:46 UTC 2015


Using a kernel based on r292334, I got this panic while importing
a ZFS pool with vfs.zfs.spa_load_verify_data and
vfs.zfs.spa_load_verify_metadata set to 0.

I've not been able to reproduce it yet and the changed sysctl's above
may not actually matter (but I usually use the defaults).

The pool has a single leaf vdev that is backed by ggatec which transfers the
data over a slow and easily saturated connection (< ~120 kB/s up). Graph:
https://www.fabiankeil.de/talks/versteckter-block-speicher/mgp00030.html

fk at r500 /usr/crash $kgdb /usr/lib/debug/boot/kernel/kernel.debug vmcore.2 
[...]
Unread portion of the kernel message buffer:
[11912] panic: vm_fault: fault on nofault entry, addr: fffffe0094653000
[11912] cpuid = 0
[11912] KDB: stack backtrace:
[...]
#0  doadump (textdump=0) at pcpu.h:221
221	pcpu.h: No such file or directory.
	in pcpu.h
(kgdb) where
#0  doadump (textdump=0) at pcpu.h:221
#1  0xffffffff8031752b in db_dump (dummy=<value optimized out>, dummy2=false, dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:533
#2  0xffffffff8031731e in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:440
#3  0xffffffff803170b4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493
#4  0xffffffff80319bbb in db_trap (type=<value optimized out>, code=0) at /usr/src/sys/ddb/db_main.c:251
#5  0xffffffff805e2dc3 in kdb_trap (type=3, code=0, tf=<value optimized out>) at /usr/src/sys/kern/subr_kdb.c:654
#6  0xffffffff8087f207 in trap (frame=0xfffffe0094f8f220) at /usr/src/sys/amd64/amd64/trap.c:549
#7  0xffffffff808641b7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:234
#8  0xffffffff805e24ab in kdb_enter (why=0xffffffff8097216b "panic", msg=0x32 <Address 0x32 out of bounds>) at cpufunc.h:63
#9  0xffffffff8059ea4f in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:750
#10 0xffffffff8059e8a3 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:688
#11 0xffffffff80835650 in vm_fault_hold (map=<value optimized out>, vaddr=<value optimized out>, fault_type=<value optimized out>, fault_flags=<value optimized out>, m_hold=<value optimized out>)
    at /usr/src/sys/vm/vm_fault.c:332
#12 0xffffffff808332f8 in vm_fault (map=0xfffff80002000000, vaddr=<value optimized out>, fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:277
#13 0xffffffff8087f97a in trap_pfault (frame=0xfffffe0094f8f8d0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:734
#14 0xffffffff8087f21e in trap (frame=0xfffffe0094f8f8d0) at /usr/src/sys/amd64/amd64/trap.c:435
#15 0xffffffff808641b7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:234
#16 0xffffffff81900c9a in dmu_objset_find_dp_impl (dcp=0xfffff80078cb0200) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:1630
#17 0xffffffff81901189 in dmu_objset_find_dp_cb (arg=0xfffff80078cb0200) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:1746
#18 0xffffffff818ab8d1 in taskq_run (arg=0xfffff800066d3d20, pending=1) at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c:109
#19 0xffffffff805f3c8b in taskqueue_run_locked (queue=0xfffff800666b2500) at /usr/src/sys/kern/subr_taskqueue.c:430
#20 0xffffffff805f4ad8 in taskqueue_thread_loop (arg=<value optimized out>) at /usr/src/sys/kern/subr_taskqueue.c:683
#21 0xffffffff8055c77c in fork_exit (callout=0xffffffff805f4a00 <taskqueue_thread_loop>, arg=0xfffff80060937470, frame=0xfffffe0094f8fc00) at /usr/src/sys/kern/kern_fork.c:1011
#22 0xffffffff808646ee in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:609
#23 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
(kgdb) f 16
#16 0xffffffff81900c9a in dmu_objset_find_dp_impl (dcp=0xfffff80078cb0200) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:1630
1630		if (*dcp->dc_error != 0)
(kgdb) p *dcp
$1 = {dc_tq = 0xfffff80060937470, dc_dp = 0xfffff8001c3d3000, dc_ddobj = 4996, dc_func = 0xffffffff819a2320 <zil_claim>, dc_arg = 0xfffff80051c72200, dc_flags = 2, dc_error_lock = 0xfffffe0094653a48, 
  dc_error = 0xfffffe0094653a80}
(kgdb) p *dcp->dc_error
Cannot access memory at address 0xfffffe0094653a80
(kgdb) p *dcp->dc_error_lock
Cannot access memory at address 0xfffffe0094653a48
(kgdb) p *dcp->dc_tq
$2 = {tq_queue = 0xfffff800666b2500}
(kgdb) p *dcp->dc_dp
$3 = {dp_spa = 0xfffff800062eb000, dp_meta_objset = 0xfffff8003db0a400, dp_root_dir = 0xfffff8000b0ba800, dp_mos_dir = 0xfffff8001cf05800, dp_free_dir = 0xfffff8004ca79400, dp_leak_dir = 0x0, 
  dp_origin_snap = 0xfffff8001a1d8400, dp_root_dir_obj = 2, dp_vnrele_taskq = 0xfffff800285be8e0, dp_meta_rootbp = {blk_dva = 0xfffff8001c3d3048, blk_prop = 9226475966770118659, blk_pad = 0xfffff8001c3d3080, 
    blk_phys_birth = 0, blk_birth = 27427, blk_fill = 5434, blk_cksum = {zc_word = 0xfffff8001c3d30a8}}, dp_tmp_userrefs_obj = 0, dp_free_bpobj = {bpo_lock = {lock_object = {
        lo_name = 0xffffffff81a58f27 "bpo->bpo_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, bpo_os = 0xfffff8003db0a400, bpo_object = 11, bpo_epb = 1024, bpo_havecomp = 1 '\001', 
    bpo_havesubobj = 1 '\001', bpo_phys = 0xfffff8006a434e00, bpo_dbuf = 0xfffff800385c76c0, bpo_cached_dbuf = 0x0}, dp_bptree_obj = 0, dp_empty_bpobj = 45, dp_scan = 0xfffff8001c39dc00, dp_lock = {
    lock_object = {lo_name = 0xffffffff81a5f0a5 "dp->dp_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, dp_spaceavail_cv = {
    cv_description = 0xffffffff81a5f0b2 "dp->dp_spaceavail_cv", cv_waiters = 0}, dp_dirty_pertxg = 0xfffff8001c3d3168, dp_dirty_total = 0, dp_mos_used_delta = 0, dp_mos_compressed_delta = 0, 
  dp_mos_uncompressed_delta = 0, dp_last_wakeup = 0, dp_tx = {tx_cpu = 0xfffff8003db0a800, tx_sync_lock = {lock_object = {lo_name = 0xffffffff81a66a91 "tx->tx_sync_lock", lo_flags = 40960000, lo_data = 0, 
        lo_witness = 0x0}, sx_lock = 1}, tx_open_txg = 27428, tx_quiesced_txg = 0, tx_syncing_txg = 0, tx_synced_txg = 0, tx_open_time = 0, tx_sync_txg_waiting = 0, tx_quiesce_txg_waiting = 0, 
    tx_sync_more_cv = {cv_description = 0xffffffff81a66aa3 "tx->tx_sync_more_cv", cv_waiters = 0}, tx_sync_done_cv = {cv_description = 0xffffffff81a66ab8 "tx->tx_sync_done_cv", cv_waiters = 0}, 
    tx_quiesce_more_cv = {cv_description = 0xffffffff81a66acd "tx->tx_quiesce_more_cv", cv_waiters = 0}, tx_quiesce_done_cv = {cv_description = 0xffffffff81a66ae5 "tx->tx_quiesce_done_cv", cv_waiters = 0}, 
    tx_timeout_cv = {cv_description = 0x0, cv_waiters = 0}, tx_exit_cv = {cv_description = 0xffffffff81a66afd "tx->tx_exit_cv", cv_waiters = 0}, tx_threads = 0 '\0', tx_exiting = 0 '\0', tx_sync_thread = 0x0, 
    tx_quiesce_thread = 0x0, tx_commit_cb_taskq = 0x0}, dp_dirty_datasets = {tl_lock = {lock_object = {lo_name = 0xffffffff81a66b4b "tl->tl_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, 
      sx_lock = 1}, tl_offset = 376, tl_head = 0xfffff8001c3d32b8}, dp_dirty_zilogs = {tl_lock = {lock_object = {lo_name = 0xffffffff81a66b4b "tl->tl_lock", lo_flags = 40960000, lo_data = 0, 
        lo_witness = 0x0}, sx_lock = 1}, tl_offset = 984, tl_head = 0xfffff8001c3d3300}, dp_dirty_dirs = {tl_lock = {lock_object = {lo_name = 0xffffffff81a66b4b "tl->tl_lock", lo_flags = 40960000, 
        lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, tl_offset = 80, tl_head = 0xfffff8001c3d3348}, dp_sync_tasks = {tl_lock = {lock_object = {lo_name = 0xffffffff81a66b4b "tl->tl_lock", lo_flags = 40960000, 
        lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, tl_offset = 0, tl_head = 0xfffff8001c3d3390}, dp_config_rwlock = {rr_lock = {lock_object = {lo_name = 0xffffffff81a62115 "rrl->rr_lock", 
        lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, rr_cv = {cv_description = 0xffffffff81a62123 "rrl->rr_cv", cv_waiters = 0}, rr_writer = 0x0, rr_anon_rcount = {rc_count = 0}, 
    rr_linked_rcount = {rc_count = 3}, rr_writer_wanted = 0, rr_track_all = 1}, dp_blkstats = 0x0}
(kgdb) p *dcp->dc_dp->dp_spa
$4 = {spa_name = 0xfffff800062eb000 "tor3", spa_comment = 0x0, spa_avl = {avl_child = 0xfffff800062eb108, avl_pcb = 18446735277720248589}, spa_config = 0xfffff800608f4a20, spa_config_syncing = 0x0, 
  spa_config_splitting = 0x0, spa_load_info = 0xfffff800061d99c0, spa_config_txg = 27427, spa_sync_pass = 0, spa_state = POOL_STATE_ACTIVE, spa_inject_ref = 0, spa_sync_on = 0 '\0', 
  spa_load_state = SPA_LOAD_OPEN, spa_import_flags = 0, spa_zio_taskq = 0xfffff800062eb168, spa_dsl_pool = 0xfffff8001c3d3000, spa_is_initializing = 0, spa_normal_class = 0xfffff8001e504400, 
  spa_log_class = 0xfffff8004dd19000, spa_first_txg = 27428, spa_final_txg = 18446744073709551615, spa_freeze_txg = 18446744073709551615, spa_load_max_txg = 18446744073709551615, spa_claim_max_txg = 27428, 
  spa_loaded_ts = {tv_sec = 1450736623, tv_nsec = 578802346}, spa_meta_objset = 0xfffff8003db0a400, spa_evicting_os_lock = {lock_object = {lo_name = 0xffffffff81a660ce "spa->spa_evicting_os_lock", 
      lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, spa_evicting_os_list = {list_size = 1008, list_offset = 256, list_head = {list_next = 0xfffff800062eb378, 
      list_prev = 0xfffff800062eb378}}, spa_evicting_os_cv = {cv_description = 0xffffffff81a6619b "spa->spa_evicting_os_cv", cv_waiters = 0}, spa_vdev_txg_list = {tl_lock = {lock_object = {
        lo_name = 0xffffffff81a66b4b "tl->tl_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, tl_offset = 624, tl_head = 0xfffff800062eb3c0}, spa_root_vdev = 0xfffff8006ab1a000, 
  spa_min_ashift = 12, spa_max_ashift = 12, spa_config_guid = 13271854445444309143, spa_load_guid = 10536853844155556401, spa_last_synced_guid = 0, spa_config_dirty_list = {list_size = 1704, 
    list_offset = 672, list_head = {list_next = 0xfffff800062eb418, list_prev = 0xfffff800062eb418}}, spa_state_dirty_list = {list_size = 1704, list_offset = 688, list_head = {list_next = 0xfffff800062eb438, 
      list_prev = 0xfffff800062eb438}}, spa_spares = {sav_object = 0, sav_config = 0x0, sav_vdevs = 0x0, sav_count = 0, sav_sync = 0, sav_pending = 0x0, sav_npending = 0}, spa_l2cache = {sav_object = 0, 
    sav_config = 0x0, sav_vdevs = 0x0, sav_count = 0, sav_sync = 0, sav_pending = 0x0, sav_npending = 0}, spa_label_features = 0xfffff800061d99a0, spa_config_object = 27, spa_config_generation = 0, 
  spa_syncing_txg = 0, spa_deferred_bpobj = {bpo_lock = {lock_object = {lo_name = 0xffffffff81a58f27 "bpo->bpo_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, 
    bpo_os = 0xfffff8003db0a400, bpo_object = 31, bpo_epb = 128, bpo_havecomp = 1 '\001', bpo_havesubobj = 1 '\001', bpo_phys = 0xfffff80006330a00, bpo_dbuf = 0xfffff800385c75e8, bpo_cached_dbuf = 0x0}, 
  spa_free_bplist = 0xfffff800062eb518, spa_cksum_salt = {zcs_bytes = 0xfffff800062eb618 "üì`~-ªY"}, spa_cksum_tmpls_lock = {lock_object = {lo_name = 0xffffffff81a66129 "spa->spa_cksum_tmpls_lock", 
      lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, spa_cksum_tmpls = 0xfffff800062eb658, spa_ubsync = {ub_magic = 12235020, ub_version = 5000, ub_txg = 27427, 
    ub_guid_sum = 3109696218321734419, ub_timestamp = 1450604034, ub_rootbp = {blk_dva = 0xfffff800062eb6d8, blk_prop = 9226475966770118659, blk_pad = 0xfffff800062eb710, blk_phys_birth = 0, 
      blk_birth = 27427, blk_fill = 5434, blk_cksum = {zc_word = 0xfffff800062eb738}}, ub_software_version = 5000}, spa_uberblock = {ub_magic = 12235020, ub_version = 5000, ub_txg = 27427, 
    ub_guid_sum = 3109696218321734419, ub_timestamp = 1450604034, ub_rootbp = {blk_dva = 0xfffff800062eb788, blk_prop = 9226475966770118659, blk_pad = 0xfffff800062eb7c0, blk_phys_birth = 0, 
      blk_birth = 27427, blk_fill = 5434, blk_cksum = {zc_word = 0xfffff800062eb7e8}}, ub_software_version = 5000}, spa_extreme_rewind = 0, spa_last_io = 11912224, spa_scrub_lock = {lock_object = {
      lo_name = 0xffffffff81a66144 "spa->spa_scrub_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, spa_scrub_inflight = 0, spa_scrub_io_cv = {
    cv_description = 0xffffffff81a661c6 "spa->spa_scrub_io_cv", cv_waiters = 0}, spa_scrub_active = 0 '\0', spa_scrub_type = 0 '\0', spa_scrub_finished = 0 '\0', spa_scrub_started = 0 '\0', 
  spa_scrub_reopen = 0 '\0', spa_scan_pass_start = 1450736643, spa_scan_pass_exam = 0, spa_async_lock = {lock_object = {lo_name = 0xffffffff81a6608c "spa->spa_async_lock", lo_flags = 40960000, lo_data = 0, 
      lo_witness = 0x0}, sx_lock = 1}, spa_async_thread = 0x0, spa_async_thread_vd = 0x0, spa_async_suspended = 0, spa_async_cv = {cv_description = 0xffffffff81a66188 "spa->spa_async_cv", cv_waiters = 0}, 
  spa_async_tasks = 0, spa_root = 0x0, spa_ena = 0, spa_last_open_failed = 2, spa_last_ubsync_txg = 0, spa_last_ubsync_txg_ts = 0, spa_load_txg = 27427, spa_load_txg_ts = 1450604034, spa_load_meta_errors = 0, 
  spa_load_data_errors = 0, spa_verify_min_txg = 27424, spa_errlog_lock = {lock_object = {lo_name = 0xffffffff81a660b8 "spa->spa_errlog_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, 
    sx_lock = 1}, spa_errlog_last = 0, spa_errlog_scrub = 0, spa_errlist_lock = {lock_object = {lo_name = 0xffffffff81a660a1 "spa->spa_errlist_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, 
    sx_lock = 1}, spa_errlist_last = {avl_root = 0x0, avl_compar = 0xffffffff8194ef20 <spa_error_entry_compare>, avl_offset = 40, avl_numnodes = 0, avl_size = 64}, spa_errlist_scrub = {avl_root = 0x0, 
    avl_compar = 0xffffffff8194ef20 <spa_error_entry_compare>, avl_offset = 40, avl_numnodes = 0, avl_size = 64}, spa_deflate = 1, spa_history = 32, spa_history_lock = {lock_object = {
      lo_name = 0xffffffff81a660e9 "spa->spa_history_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, spa_pending_vdev = 0x0, spa_props_lock = {lock_object = {
      lo_name = 0xffffffff81a66114 "spa->spa_props_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, spa_pool_props_object = 33, spa_bootfs = 0, spa_failmode = 1, spa_delegation = 1, 
  spa_config_list = {list_size = 24, list_offset = 0, list_head = {list_next = 0xfffff800061d9a00, list_prev = 0xfffff800061d9a00}}, spa_async_zio_root = 0xfffff80025dfeb50, spa_suspend_zio_root = 0x0, 
  spa_suspend_lock = {lock_object = {lo_name = 0xffffffff81a66159 "spa->spa_suspend_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, spa_suspend_cv = {
    cv_description = 0xffffffff81a661dc "spa->spa_suspend_cv", cv_waiters = 0}, spa_suspended = 0 '\0', spa_claiming = 1 '\001', spa_debug = 0, spa_is_root = 0, spa_minref = 0, spa_mode = 3, 
  spa_log_state = SPA_LOG_UNKNOWN, spa_autoexpand = 0, spa_ddt = 0xfffff800062ebaa8, spa_ddt_stat_object = 0, spa_dedup_ditto = 0, spa_dedup_checksum = 8, spa_dspace = 10926396801024, spa_vdev_top_lock = {
    lock_object = {lo_name = 0xffffffff81a66170 "spa->spa_vdev_top_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, spa_proc_lock = {lock_object = {
      lo_name = 0xffffffff81a66100 "spa->spa_proc_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, spa_proc_cv = {cv_description = 0xffffffff81a661b4 "spa->spa_proc_cv", 
    cv_waiters = 0}, spa_proc_state = SPA_PROC_NONE, spa_proc = 0xffffffff81620028, spa_did = 0, spa_trim_thread = 0xfffff80056b074d0, spa_trim_lock = {lock_object = {
      lo_name = 0xffffffff81a6dfe1 "spa->spa_trim_lock", lo_flags = 40960000, lo_data = 0, lo_witness = 0x0}, sx_lock = 1}, spa_trim_cv = {cv_description = 0xffffffff81a6dff5 "spa->spa_trim_cv", 
    cv_waiters = 131}, spa_autoreplace = 1, spa_vdev_locks = 0, spa_creation_version = 5000, spa_prev_software_version = 5000, spa_feat_for_write_obj = 29, spa_feat_for_read_obj = 28, spa_feat_desc_obj = 30, 
  spa_feat_enabled_txg_obj = 34, spa_feat_refcount_cache = 0xfffff800062ebbf8, spa_deadman_cycid = {c_links = {le = {le_next = 0x0, le_prev = 0x0}, sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, 
        tqe_prev = 0x0}}, c_time = 0, c_precision = 0, c_arg = 0x0, c_func = 0, c_lock = 0x0, c_flags = 0, c_iflags = 16, c_cpu = 0}, spa_deadman_calls = 0, spa_sync_starttime = 0, 
  spa_deadman_synctime = 1000000000000, spa_ccw_fail_time = 0, spa_config_lock = 0xfffff800062ebcb8, spa_refcount = {rc_count = 31}, spa_splitting_newspa = 0}

tor3 is the imported ZFS pool.

Given the location of the trap, this could be a regression caused
by the import of illumos #5269 (zpool import slow) in r286686:
https://svnweb.freebsd.org/base?view=revision&revision=r286686

Fabian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20151222/b6e29e86/attachment.sig>


More information about the freebsd-fs mailing list