zfs meta data slowness
Ronald Klop
ronald-lists at klop.ws
Wed Jul 22 11:24:34 UTC 2020
From: mike tancsa <mike at sentex.net>
Date: Tuesday, 21 July 2020 21:37
To: Ronald Klop <ronald-lists at klop.ws>, FreeBSD-STABLE Mailing List <freebsd-stable at freebsd.org>
Subject: Re: zfs meta data slowness
>
> Hi,
> Thanks for the response. Reply in line
>
> On 7/20/2020 9:04 AM, Ronald Klop wrote:
> > Hi,
> >
> > My first suggestion would be to remove a lot of snapshots. But that may
> > not match your business case.
>
> As it's a backup server, it's sort of the point to have all those snapshots.
>
>
> > Maybe you can provide more information about your setup:
> > Amount of RAM, CPU?
> 64G, Xeon(R) CPU E3-1240 v6 @ 3.70GHz
> > output of "zpool status"
> # zpool status -x
>
> all pools are healthy
>
That is nice to know.
Instead of "zpool status -x", the output of plain "zpool status" would be very interesting, and "zpool list" as well. That gives the reader information about your setup, which helps in thinking along about the possible cause.
But as somebody else mentioned, profiling the kernel might be the best thing to do. DTrace can be used for that, although I don't know the exact commands by heart.
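Something like this (a rough sketch from memory, untested) samples kernel stacks at roughly 1000 Hz for 30 seconds and prints the most common ones; run it while a slow "zfs list" is in progress:

# dtrace -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-30s { exit(0); }'

The stacks that show up most often should point at where the kernel is spending its time.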
If I remember correctly, there is an optimization for "zfs list -o name". It is much faster because it does not fetch the extra information from the disks.
See: https://svnweb.freebsd.org/base?view=revision&revision=230438
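For example (again just a sketch, run as root), listing only the names should be much cheaper than the default output, which also reads the space accounting properties for every snapshot:

# time zfs list -t snapshot -o name -s name > /dev/null
# time zfs list -t snapshot > /dev/null

If the first command is fast and the second is slow, the time is going into fetching the per-snapshot properties from disk.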
Regards,
Ronald.
>
> > output of "zfs list" if possible to share
>
> it's a big list
>
> # zfs list | wc
> 824 4120 107511
>
>
> > Type of disks/ssds?
> old-school spinning disks: Device Model: WDC WD80EFAX-68KNBN0
> > What is the load of the system? I/O per second, etc.
> it's not CPU bound; the disks are sometimes running at 100% based on gstat,
> but not always
> > Do you use dedup, GELI?
>
> no and no
>
>
> > Something else special about the setup.
> > output of "top -b"
> >
>
> ports are being built in a VM right now, but the problem (zrepl hanging
> and "zfs list -t snapshot" taking forever) happens regardless
>
> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
> 4439 root 12 40 20 6167M 5762M kqread 3 535:13 200.00% bhyve
> 98783 root 2 21 0 16M 5136K hdr->b 4 0:01 1.95% zfs
> 76489 root 21 23 0 738M 54M uwait 1 2:18 0.88% zrepl
> 98784 root 1 21 0 13M 3832K piperd 3 0:01 0.59% zfs
> 99563 root 1 20 0 13M 4136K zio->i 4 0:00 0.39% zfs
> 16136 root 18 25 0 705M 56M uwait 3 29:58 0.00% zrepl-freebsd-amd64
> 1845 root 1 20 0 12M 3772K nanslp 7 5:54 0.00% ossec-syscheckd
> 1567 root 1 20 0 11M 2744K select 0 2:22 0.00% syslogd
> 1737 root 32 20 0 11M 2844K rpcsvc 6 1:40 0.00% nfsd
> 1660 root 1 -52 r0 11M 11M nanslp 5 1:18 0.00% watchdogd
> 1434 root 1 20 0 9988K 988K select 3 0:27 0.00% devd
> 2435 mdtancsa 1 20 0 20M 8008K select 0 0:21 0.00% sshd
> 1754 root 3 20 0 18M 3556K select 1 0:11 0.00% apcupsd
> 5917 root 1 20 0 11M 2672K select 2 0:06 0.00% script
> 1449 _pflogd 1 20 0 12M 3572K bpf 3 0:05 0.00% pflogd
>
> ---Mike
>
> > That kind of information.
> >
> > Regards,
> > Ronald.
> >
> >
> > From: mike tancsa <mike at sentex.net>
> > Date: Sunday, 19 July 2020 16:17
> > To: FreeBSD-STABLE Mailing List <freebsd-stable at freebsd.org>
> > Subject: zfs meta data slowness
> >>
> >> Are there any tweaks that can be done to speed up or improve ZFS
> >> metadata performance? I have a backup server with a lot of snapshots
> >> (40,000) and just doing a listing can take a great deal of time. Best
> >> case is about 24 seconds; worst case, I have seen it take up to 15
> >> minutes. (FreeBSD 12.1-STABLE r363078)
> >>
> >>
> >> ARC Efficiency: 79.33b
> >> Cache Hit Ratio: 92.81% 73.62b
> >> Cache Miss Ratio: 7.19% 5.71b
> >> Actual Hit Ratio: 92.78% 73.60b
> >>
> >> Data Demand Efficiency: 96.47% 461.91m
> >> Data Prefetch Efficiency: 1.00% 262.73m
> >>
> >> CACHE HITS BY CACHE LIST:
> >> Anonymously Used: 0.01% 3.86m
> >> Most Recently Used: 3.91% 2.88b
> >> Most Frequently Used: 96.06% 70.72b
> >> Most Recently Used Ghost: 0.01% 5.31m
> >> Most Frequently Used Ghost: 0.01% 10.47m
> >>
> >> CACHE HITS BY DATA TYPE:
> >> Demand Data: 0.61% 445.60m
> >> Prefetch Data: 0.00% 2.63m
> >> Demand Metadata: 99.36% 73.15b
> >> Prefetch Metadata: 0.03% 21.00m
> >>
> >> CACHE MISSES BY DATA TYPE:
> >> Demand Data: 0.29% 16.31m
> >> Prefetch Data: 4.56% 260.10m
> >> Demand Metadata: 95.02% 5.42b
> >> Prefetch Metadata: 0.14% 7.75m
> >>
> >>
> >> Other than increasing the metadata max, I haven't really changed any
> >> tunables
> >>
> >>
> >> ZFS Tunables (sysctl):
> >> kern.maxusers 4416
> >> vm.kmem_size 66691842048
> >> vm.kmem_size_scale 1
> >> vm.kmem_size_min 0
> >> vm.kmem_size_max 1319413950874
> >> vfs.zfs.trim.max_interval 1
> >> vfs.zfs.trim.timeout 30
> >> vfs.zfs.trim.txg_delay 32
> >> vfs.zfs.trim.enabled 1
> >> vfs.zfs.vol.immediate_write_sz 32768
> >> vfs.zfs.vol.unmap_sync_enabled 0
> >> vfs.zfs.vol.unmap_enabled 1
> >> vfs.zfs.vol.recursive 0
> >> vfs.zfs.vol.mode 1
> >> vfs.zfs.version.zpl 5
> >> vfs.zfs.version.spa 5000
> >> vfs.zfs.version.acl 1
> >> vfs.zfs.version.ioctl 7
> >> vfs.zfs.debug 0
> >> vfs.zfs.super_owner 0
> >> vfs.zfs.immediate_write_sz 32768
> >> vfs.zfs.sync_pass_rewrite 2
> >> vfs.zfs.sync_pass_dont_compress 5
> >> vfs.zfs.sync_pass_deferred_free 2
> >> vfs.zfs.zio.dva_throttle_enabled 1
> >> vfs.zfs.zio.exclude_metadata 0
> >> vfs.zfs.zio.use_uma 1
> >> vfs.zfs.zio.taskq_batch_pct 75
> >> vfs.zfs.zil_maxblocksize 131072
> >> vfs.zfs.zil_slog_bulk 786432
> >> vfs.zfs.zil_nocacheflush 0
> >> vfs.zfs.zil_replay_disable 0
> >> vfs.zfs.cache_flush_disable 0
> >> vfs.zfs.standard_sm_blksz 131072
> >> vfs.zfs.dtl_sm_blksz 4096
> >> vfs.zfs.min_auto_ashift 9
> >> vfs.zfs.max_auto_ashift 13
> >> vfs.zfs.vdev.trim_max_pending 10000
> >> vfs.zfs.vdev.bio_delete_disable 0
> >> vfs.zfs.vdev.bio_flush_disable 0
> >> vfs.zfs.vdev.def_queue_depth 32
> >> vfs.zfs.vdev.queue_depth_pct 1000
> >> vfs.zfs.vdev.write_gap_limit 4096
> >> vfs.zfs.vdev.read_gap_limit 32768
> >> vfs.zfs.vdev.aggregation_limit_non_rotating 131072
> >> vfs.zfs.vdev.aggregation_limit 1048576
> >> vfs.zfs.vdev.initializing_max_active 1
> >> vfs.zfs.vdev.initializing_min_active 1
> >> vfs.zfs.vdev.removal_max_active 2
> >> vfs.zfs.vdev.removal_min_active 1
> >> vfs.zfs.vdev.trim_max_active 64
> >> vfs.zfs.vdev.trim_min_active 1
> >> vfs.zfs.vdev.scrub_max_active 2
> >> vfs.zfs.vdev.scrub_min_active 1
> >> vfs.zfs.vdev.async_write_max_active 10
> >> vfs.zfs.vdev.async_write_min_active 1
> >> vfs.zfs.vdev.async_read_max_active 3
> >> vfs.zfs.vdev.async_read_min_active 1
> >> vfs.zfs.vdev.sync_write_max_active 10
> >> vfs.zfs.vdev.sync_write_min_active 10
> >> vfs.zfs.vdev.sync_read_max_active 10
> >> vfs.zfs.vdev.sync_read_min_active 10
> >> vfs.zfs.vdev.max_active 1000
> >> vfs.zfs.vdev.async_write_active_max_dirty_percent 60
> >> vfs.zfs.vdev.async_write_active_min_dirty_percent 30
> >> vfs.zfs.vdev.mirror.non_rotating_seek_inc 1
> >> vfs.zfs.vdev.mirror.non_rotating_inc 0
> >> vfs.zfs.vdev.mirror.rotating_seek_offset 1048576
> >> vfs.zfs.vdev.mirror.rotating_seek_inc 5
> >> vfs.zfs.vdev.mirror.rotating_inc 0
> >> vfs.zfs.vdev.trim_on_init 1
> >> vfs.zfs.vdev.cache.bshift 16
> >> vfs.zfs.vdev.cache.size 0
> >> vfs.zfs.vdev.cache.max 16384
> >> vfs.zfs.vdev.validate_skip 0
> >> vfs.zfs.vdev.max_ms_shift 34
> >> vfs.zfs.vdev.default_ms_shift 29
> >> vfs.zfs.vdev.max_ms_count_limit 131072
> >> vfs.zfs.vdev.min_ms_count 16
> >> vfs.zfs.vdev.default_ms_count 200
> >> vfs.zfs.txg.timeout 5
> >> vfs.zfs.space_map_ibs 14
> >> vfs.zfs.special_class_metadata_reserve_pct 25
> >> vfs.zfs.user_indirect_is_special 1
> >> vfs.zfs.ddt_data_is_special 1
> >> vfs.zfs.spa_allocators 4
> >> vfs.zfs.spa_min_slop 134217728
> >> vfs.zfs.spa_slop_shift 5
> >> vfs.zfs.spa_asize_inflation 24
> >> vfs.zfs.deadman_enabled 1
> >> vfs.zfs.deadman_checktime_ms 5000
> >> vfs.zfs.deadman_synctime_ms 1000000
> >> vfs.zfs.debugflags 0
> >> vfs.zfs.recover 0
> >> vfs.zfs.spa_load_verify_data 1
> >> vfs.zfs.spa_load_verify_metadata 1
> >> vfs.zfs.spa_load_verify_maxinflight 10000
> >> vfs.zfs.max_missing_tvds_scan 0
> >> vfs.zfs.max_missing_tvds_cachefile 2
> >> vfs.zfs.max_missing_tvds 0
> >> vfs.zfs.spa_load_print_vdev_tree 0
> >> vfs.zfs.ccw_retry_interval 300
> >> vfs.zfs.check_hostid 1
> >> vfs.zfs.multihost_fail_intervals 10
> >> vfs.zfs.multihost_import_intervals 20
> >> vfs.zfs.multihost_interval 1000
> >> vfs.zfs.mg_fragmentation_threshold 85
> >> vfs.zfs.mg_noalloc_threshold 0
> >> vfs.zfs.condense_pct 200
> >> vfs.zfs.metaslab_sm_blksz 4096
> >> vfs.zfs.metaslab.bias_enabled 1
> >> vfs.zfs.metaslab.lba_weighting_enabled 1
> >> vfs.zfs.metaslab.fragmentation_factor_enabled 1
> >> vfs.zfs.metaslab.preload_enabled 1
> >> vfs.zfs.metaslab.preload_limit 3
> >> vfs.zfs.metaslab.unload_delay 8
> >> vfs.zfs.metaslab.load_pct 50
> >> vfs.zfs.metaslab.min_alloc_size 33554432
> >> vfs.zfs.metaslab.df_free_pct 4
> >> vfs.zfs.metaslab.df_alloc_threshold 131072
> >> vfs.zfs.metaslab.debug_unload 0
> >> vfs.zfs.metaslab.debug_load 0
> >> vfs.zfs.metaslab.fragmentation_threshold 70
> >> vfs.zfs.metaslab.force_ganging 16777217
> >> vfs.zfs.free_bpobj_enabled 1
> >> vfs.zfs.free_max_blocks -1
> >> vfs.zfs.zfs_scan_checkpoint_interval 7200
> >> vfs.zfs.zfs_scan_legacy 0
> >> vfs.zfs.no_scrub_prefetch 0
> >> vfs.zfs.no_scrub_io 0
> >> vfs.zfs.resilver_min_time_ms 3000
> >> vfs.zfs.free_min_time_ms 1000
> >> vfs.zfs.scan_min_time_ms 1000
> >> vfs.zfs.scan_idle 50
> >> vfs.zfs.scrub_delay 4
> >> vfs.zfs.resilver_delay 2
> >> vfs.zfs.zfetch.array_rd_sz 1048576
> >> vfs.zfs.zfetch.max_idistance 67108864
> >> vfs.zfs.zfetch.max_distance 8388608
> >> vfs.zfs.zfetch.min_sec_reap 2
> >> vfs.zfs.zfetch.max_streams 8
> >> vfs.zfs.prefetch_disable 0
> >> vfs.zfs.delay_scale 500000
> >> vfs.zfs.delay_min_dirty_percent 60
> >> vfs.zfs.dirty_data_sync_pct 20
> >> vfs.zfs.dirty_data_max_percent 10
> >> vfs.zfs.dirty_data_max_max 4294967296
> >> vfs.zfs.dirty_data_max 4294967296
> >> vfs.zfs.max_recordsize 1048576
> >> vfs.zfs.default_ibs 17
> >> vfs.zfs.default_bs 9
> >> vfs.zfs.send_holes_without_birth_time 1
> >> vfs.zfs.mdcomp_disable 0
> >> vfs.zfs.per_txg_dirty_frees_percent 5
> >> vfs.zfs.nopwrite_enabled 1
> >> vfs.zfs.dedup.prefetch 1
> >> vfs.zfs.dbuf_cache_lowater_pct 10
> >> vfs.zfs.dbuf_cache_hiwater_pct 10
> >> vfs.zfs.dbuf_metadata_cache_overflow 0
> >> vfs.zfs.dbuf_metadata_cache_shift 6
> >> vfs.zfs.dbuf_cache_shift 5
> >> vfs.zfs.dbuf_metadata_cache_max_bytes 1025282816
> >> vfs.zfs.dbuf_cache_max_bytes 2050565632
> >> vfs.zfs.arc_min_prescient_prefetch_ms 6
> >> vfs.zfs.arc_min_prefetch_ms 1
> >> vfs.zfs.l2c_only_size 0
> >> vfs.zfs.mfu_ghost_data_esize 7778263552
> >> vfs.zfs.mfu_ghost_metadata_esize 16851792896
> >> vfs.zfs.mfu_ghost_size 24630056448
> >> vfs.zfs.mfu_data_esize 3059418112
> >> vfs.zfs.mfu_metadata_esize 28641792
> >> vfs.zfs.mfu_size 6399023104
> >> vfs.zfs.mru_ghost_data_esize 2199812096
> >> vfs.zfs.mru_ghost_metadata_esize 6289682432
> >> vfs.zfs.mru_ghost_size 8489494528
> >> vfs.zfs.mru_data_esize 22781456384
> >> vfs.zfs.mru_metadata_esize 309155840
> >> vfs.zfs.mru_size 23847875584
> >> vfs.zfs.anon_data_esize 0
> >> vfs.zfs.anon_metadata_esize 0
> >> vfs.zfs.anon_size 8556544
> >> vfs.zfs.l2arc_norw 1
> >> vfs.zfs.l2arc_feed_again 1
> >> vfs.zfs.l2arc_noprefetch 1
> >> vfs.zfs.l2arc_feed_min_ms 200
> >> vfs.zfs.l2arc_feed_secs 1
> >> vfs.zfs.l2arc_headroom 2
> >> vfs.zfs.l2arc_write_boost 8388608
> >> vfs.zfs.l2arc_write_max 8388608
> >> vfs.zfs.arc_meta_strategy 1
> >> vfs.zfs.arc_meta_limit 15833624576
> >> vfs.zfs.arc_free_target 346902
> >> vfs.zfs.arc_kmem_cache_reap_retry_ms 1000
> >> vfs.zfs.compressed_arc_enabled 1
> >> vfs.zfs.arc_grow_retry 60
> >> vfs.zfs.arc_shrink_shift 7
> >> vfs.zfs.arc_average_blocksize 8192
> >> vfs.zfs.arc_no_grow_shift 5
> >> vfs.zfs.arc_min 8202262528
> >> vfs.zfs.arc_max 39334498304
> >> vfs.zfs.abd_chunk_size 4096
> >> vfs.zfs.abd_scatter_enabled 1
> >>
> >
>