zfs meta data slowness

Ronald Klop ronald-lists at klop.ws
Wed Jul 22 11:24:34 UTC 2020


 
From: mike tancsa <mike at sentex.net>
Date: Tuesday, 21 July 2020 21:37
To: Ronald Klop <ronald-lists at klop.ws>, FreeBSD-STABLE Mailing List <freebsd-stable at freebsd.org>
Subject: Re: zfs meta data slowness
> 
> Hi,
>     Thanks for the response. Reply in line
> 
> On 7/20/2020 9:04 AM, Ronald Klop wrote:
> > Hi,
> >
> > My first suggestion would be to remove a lot of snapshots. But that may
> > not match your business case.
> 
> As it's a backup server, it's sort of the point to have all those snapshots.
> 
> 
> > Maybe you can provide more information about your setup:
> > Amount of RAM, CPU?
> 64G, Xeon(R) CPU E3-1240 v6 @ 3.70GHz
> > output of "zpool status"
> # zpool status -x
> 
> all pools are healthy
>  

That is good to know.
Instead of "zpool status -x", the full output of "zpool status" is much more interesting, and so is "zpool list". That gives the reader information about your setup, which helps in thinking along about the possible cause.
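
For example, something like this would give a useful picture (just a sketch of the kind of output that helps; the 5-second interval is an arbitrary choice):

    # zpool status
    # zpool list
    # zpool iostat -v 5      (a few samples while a slow "zfs list -t snapshot" is running)
    # gstat -p               (per-disk busy %, as you already used)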

But as somebody else mentioned, profiling the kernel might be the best thing to do. DTrace can be used for that, although I don't know those commands by heart.
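Something along these lines should work as a starting point (untested here; the 997 Hz sample rate and 30-second window are arbitrary, and the dtrace kernel modules need to be loaded, e.g. with "kldload dtraceall"). Run it while the slow "zfs list" is going on:

    # dtrace -x stackframes=100 \
        -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-30s { exit(0); }'

The stacks printed last (the most frequent ones) show where the kernel is spending its time.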

If I remember correctly, there is an optimization for "zfs list -o name". It is much faster because it does not fetch extra property information from the disks.
See: https://svnweb.freebsd.org/base?view=revision&revision=230438
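
So a quick comparison might be interesting (standard zfs list options; timings will of course differ on your system, and the redirect is just so terminal output does not dominate):

    # time zfs list -t snapshot -o name > /dev/null
    # time zfs list -t snapshot > /dev/null

If the first one is much faster, the time is going into reading per-snapshot properties rather than into walking the snapshot list itself.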

Regards,
Ronald.

 > 
> > output of "zfs list" if possible to share
> 
> it's a big list
> 
> # zfs list | wc
>      824    4120  107511
> 
> 
> > Type of disks/ssds?
> old school Device Model:     WDC WD80EFAX-68KNBN0
> > What is the load of the system? I/O per second, etc.
> it's not CPU bound; disks are sometimes running at 100% based on gstat,
> but not always
> > Do you use dedup, GELI?
> 
> no and no
> 
> 
> > Something else special about the setup.
> > output of "top -b"
> >
> 
> ports are being built in a VM right now, but the problem (zrepl hanging
> and zfs list -t snapshots taking forever) happens regardless
> 
>   PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
>  4439 root         12  40   20  6167M  5762M kqread   3 535:13 200.00% bhyve
> 98783 root          2  21    0    16M  5136K hdr->b   4   0:01   1.95% zfs
> 76489 root         21  23    0   738M    54M uwait    1   2:18   0.88% zrepl
> 98784 root          1  21    0    13M  3832K piperd   3   0:01   0.59% zfs
> 99563 root          1  20    0    13M  4136K zio->i   4   0:00   0.39% zfs
> 16136 root         18  25    0   705M    56M uwait    3  29:58   0.00% zrepl-freebsd-amd64
>  1845 root          1  20    0    12M  3772K nanslp   7   5:54   0.00% ossec-syscheckd
>  1567 root          1  20    0    11M  2744K select   0   2:22   0.00% syslogd
>  1737 root         32  20    0    11M  2844K rpcsvc   6   1:40   0.00% nfsd
>  1660 root          1 -52   r0    11M    11M nanslp   5   1:18   0.00% watchdogd
>  1434 root          1  20    0  9988K   988K select   3   0:27   0.00% devd
>  2435 mdtancsa      1  20    0    20M  8008K select   0   0:21   0.00% sshd
>  1754 root          3  20    0    18M  3556K select   1   0:11   0.00% apcupsd
>  5917 root          1  20    0    11M  2672K select   2   0:06   0.00% script
>  1449 _pflogd       1  20    0    12M  3572K bpf      3   0:05   0.00% pflogd
> 
>     ---Mike
> 
> > That kind of information.
> >
> > Regards,
> > Ronald.
> >
> >
> > From: mike tancsa <mike at sentex.net>
> > Date: Sunday, 19 July 2020 16:17
> > To: FreeBSD-STABLE Mailing List <freebsd-stable at freebsd.org>
> > Subject: zfs meta data slowness
> >>
> >> Are there any tweaks that can be done to speed up or improve zfs
> >> metadata performance? I have a backup server with a lot of snapshots
> >> (40,000) and just doing a listing can take a great deal of time. Best
> >> case is about 24 seconds; worst case, I have seen it take up to 15
> >> minutes. (FreeBSD 12.1-STABLE r363078)
> >>
> >>
> >> ARC Efficiency:                                 79.33b
> >>         Cache Hit Ratio:                92.81%  73.62b
> >>         Cache Miss Ratio:               7.19%   5.71b
> >>         Actual Hit Ratio:               92.78%  73.60b
> >>
> >>         Data Demand Efficiency:         96.47%  461.91m
> >>         Data Prefetch Efficiency:       1.00%   262.73m
> >>
> >>         CACHE HITS BY CACHE LIST:
> >>           Anonymously Used:             0.01%   3.86m
> >>           Most Recently Used:           3.91%   2.88b
> >>           Most Frequently Used:         96.06%  70.72b
> >>           Most Recently Used Ghost:     0.01%   5.31m
> >>           Most Frequently Used Ghost:   0.01%   10.47m
> >>
> >>         CACHE HITS BY DATA TYPE:
> >>           Demand Data:                  0.61%   445.60m
> >>           Prefetch Data:                0.00%   2.63m
> >>           Demand Metadata:              99.36%  73.15b
> >>           Prefetch Metadata:            0.03%   21.00m
> >>
> >>         CACHE MISSES BY DATA TYPE:
> >>           Demand Data:                  0.29%   16.31m
> >>           Prefetch Data:                4.56%   260.10m
> >>           Demand Metadata:              95.02%  5.42b
> >>           Prefetch Metadata:            0.14%   7.75m
> >>
> >>
> >> Other than increasing the metadata max, I haven't really changed any
> >> tunables
> >>
> >>
> >> ZFS Tunables (sysctl):
> >>         kern.maxusers                           4416
> >>         vm.kmem_size                            66691842048
> >>         vm.kmem_size_scale                      1
> >>         vm.kmem_size_min                        0
> >>         vm.kmem_size_max                        1319413950874
> >>         vfs.zfs.trim.max_interval               1
> >>         vfs.zfs.trim.timeout                    30
> >>         vfs.zfs.trim.txg_delay                  32
> >>         vfs.zfs.trim.enabled                    1
> >>         vfs.zfs.vol.immediate_write_sz          32768
> >>         vfs.zfs.vol.unmap_sync_enabled          0
> >>         vfs.zfs.vol.unmap_enabled               1
> >>         vfs.zfs.vol.recursive                   0
> >>         vfs.zfs.vol.mode                        1
> >>         vfs.zfs.version.zpl                     5
> >>         vfs.zfs.version.spa                     5000
> >>         vfs.zfs.version.acl                     1
> >>         vfs.zfs.version.ioctl                   7
> >>         vfs.zfs.debug                           0
> >>         vfs.zfs.super_owner                     0
> >>         vfs.zfs.immediate_write_sz              32768
> >>         vfs.zfs.sync_pass_rewrite               2
> >>         vfs.zfs.sync_pass_dont_compress         5
> >>         vfs.zfs.sync_pass_deferred_free         2
> >>         vfs.zfs.zio.dva_throttle_enabled        1
> >>         vfs.zfs.zio.exclude_metadata            0
> >>         vfs.zfs.zio.use_uma                     1
> >>         vfs.zfs.zio.taskq_batch_pct             75
> >>         vfs.zfs.zil_maxblocksize                131072
> >>         vfs.zfs.zil_slog_bulk                   786432
> >>         vfs.zfs.zil_nocacheflush                0
> >>         vfs.zfs.zil_replay_disable              0
> >>         vfs.zfs.cache_flush_disable             0
> >>         vfs.zfs.standard_sm_blksz               131072
> >>         vfs.zfs.dtl_sm_blksz                    4096
> >>         vfs.zfs.min_auto_ashift                 9
> >>         vfs.zfs.max_auto_ashift                 13
> >>         vfs.zfs.vdev.trim_max_pending           10000
> >>         vfs.zfs.vdev.bio_delete_disable         0
> >>         vfs.zfs.vdev.bio_flush_disable          0
> >>         vfs.zfs.vdev.def_queue_depth            32
> >>         vfs.zfs.vdev.queue_depth_pct            1000
> >>         vfs.zfs.vdev.write_gap_limit            4096
> >>         vfs.zfs.vdev.read_gap_limit             32768
> >>         vfs.zfs.vdev.aggregation_limit_non_rotating 131072
> >>         vfs.zfs.vdev.aggregation_limit          1048576
> >>         vfs.zfs.vdev.initializing_max_active    1
> >>         vfs.zfs.vdev.initializing_min_active    1
> >>         vfs.zfs.vdev.removal_max_active         2
> >>         vfs.zfs.vdev.removal_min_active         1
> >>         vfs.zfs.vdev.trim_max_active            64
> >>         vfs.zfs.vdev.trim_min_active            1
> >>         vfs.zfs.vdev.scrub_max_active           2
> >>         vfs.zfs.vdev.scrub_min_active           1
> >>         vfs.zfs.vdev.async_write_max_active     10
> >>         vfs.zfs.vdev.async_write_min_active     1
> >>         vfs.zfs.vdev.async_read_max_active      3
> >>         vfs.zfs.vdev.async_read_min_active      1
> >>         vfs.zfs.vdev.sync_write_max_active      10
> >>         vfs.zfs.vdev.sync_write_min_active      10
> >>         vfs.zfs.vdev.sync_read_max_active       10
> >>         vfs.zfs.vdev.sync_read_min_active       10
> >>         vfs.zfs.vdev.max_active                 1000
> >>         vfs.zfs.vdev.async_write_active_max_dirty_percent 60
> >>         vfs.zfs.vdev.async_write_active_min_dirty_percent 30
> >>         vfs.zfs.vdev.mirror.non_rotating_seek_inc 1
> >>         vfs.zfs.vdev.mirror.non_rotating_inc    0
> >>         vfs.zfs.vdev.mirror.rotating_seek_offset 1048576
> >>         vfs.zfs.vdev.mirror.rotating_seek_inc   5
> >>         vfs.zfs.vdev.mirror.rotating_inc        0
> >>         vfs.zfs.vdev.trim_on_init               1
> >>         vfs.zfs.vdev.cache.bshift               16
> >>         vfs.zfs.vdev.cache.size                 0
> >>         vfs.zfs.vdev.cache.max                  16384
> >>         vfs.zfs.vdev.validate_skip              0
> >>         vfs.zfs.vdev.max_ms_shift               34
> >>         vfs.zfs.vdev.default_ms_shift           29
> >>         vfs.zfs.vdev.max_ms_count_limit         131072
> >>         vfs.zfs.vdev.min_ms_count               16
> >>         vfs.zfs.vdev.default_ms_count           200
> >>         vfs.zfs.txg.timeout                     5
> >>         vfs.zfs.space_map_ibs                   14
> >>         vfs.zfs.special_class_metadata_reserve_pct 25
> >>         vfs.zfs.user_indirect_is_special        1
> >>         vfs.zfs.ddt_data_is_special             1
> >>         vfs.zfs.spa_allocators                  4
> >>         vfs.zfs.spa_min_slop                    134217728
> >>         vfs.zfs.spa_slop_shift                  5
> >>         vfs.zfs.spa_asize_inflation             24
> >>         vfs.zfs.deadman_enabled                 1
> >>         vfs.zfs.deadman_checktime_ms            5000
> >>         vfs.zfs.deadman_synctime_ms             1000000
> >>         vfs.zfs.debugflags                      0
> >>         vfs.zfs.recover                         0
> >>         vfs.zfs.spa_load_verify_data            1
> >>         vfs.zfs.spa_load_verify_metadata        1
> >>         vfs.zfs.spa_load_verify_maxinflight     10000
> >>         vfs.zfs.max_missing_tvds_scan           0
> >>         vfs.zfs.max_missing_tvds_cachefile      2
> >>         vfs.zfs.max_missing_tvds                0
> >>         vfs.zfs.spa_load_print_vdev_tree        0
> >>         vfs.zfs.ccw_retry_interval              300
> >>         vfs.zfs.check_hostid                    1
> >>         vfs.zfs.multihost_fail_intervals        10
> >>         vfs.zfs.multihost_import_intervals      20
> >>         vfs.zfs.multihost_interval              1000
> >>         vfs.zfs.mg_fragmentation_threshold      85
> >>         vfs.zfs.mg_noalloc_threshold            0
> >>         vfs.zfs.condense_pct                    200
> >>         vfs.zfs.metaslab_sm_blksz               4096
> >>         vfs.zfs.metaslab.bias_enabled           1
> >>         vfs.zfs.metaslab.lba_weighting_enabled  1
> >>         vfs.zfs.metaslab.fragmentation_factor_enabled 1
> >>         vfs.zfs.metaslab.preload_enabled        1
> >>         vfs.zfs.metaslab.preload_limit          3
> >>         vfs.zfs.metaslab.unload_delay           8
> >>         vfs.zfs.metaslab.load_pct               50
> >>         vfs.zfs.metaslab.min_alloc_size         33554432
> >>         vfs.zfs.metaslab.df_free_pct            4
> >>         vfs.zfs.metaslab.df_alloc_threshold     131072
> >>         vfs.zfs.metaslab.debug_unload           0
> >>         vfs.zfs.metaslab.debug_load             0
> >>         vfs.zfs.metaslab.fragmentation_threshold 70
> >>         vfs.zfs.metaslab.force_ganging          16777217
> >>         vfs.zfs.free_bpobj_enabled              1
> >>         vfs.zfs.free_max_blocks                 -1
> >>         vfs.zfs.zfs_scan_checkpoint_interval    7200
> >>         vfs.zfs.zfs_scan_legacy                 0
> >>         vfs.zfs.no_scrub_prefetch               0
> >>         vfs.zfs.no_scrub_io                     0
> >>         vfs.zfs.resilver_min_time_ms            3000
> >>         vfs.zfs.free_min_time_ms                1000
> >>         vfs.zfs.scan_min_time_ms                1000
> >>         vfs.zfs.scan_idle                       50
> >>         vfs.zfs.scrub_delay                     4
> >>         vfs.zfs.resilver_delay                  2
> >>         vfs.zfs.zfetch.array_rd_sz              1048576
> >>         vfs.zfs.zfetch.max_idistance            67108864
> >>         vfs.zfs.zfetch.max_distance             8388608
> >>         vfs.zfs.zfetch.min_sec_reap             2
> >>         vfs.zfs.zfetch.max_streams              8
> >>         vfs.zfs.prefetch_disable                0
> >>         vfs.zfs.delay_scale                     500000
> >>         vfs.zfs.delay_min_dirty_percent         60
> >>         vfs.zfs.dirty_data_sync_pct             20
> >>         vfs.zfs.dirty_data_max_percent          10
> >>         vfs.zfs.dirty_data_max_max              4294967296
> >>         vfs.zfs.dirty_data_max                  4294967296
> >>         vfs.zfs.max_recordsize                  1048576
> >>         vfs.zfs.default_ibs                     17
> >>         vfs.zfs.default_bs                      9
> >>         vfs.zfs.send_holes_without_birth_time   1
> >>         vfs.zfs.mdcomp_disable                  0
> >>         vfs.zfs.per_txg_dirty_frees_percent     5
> >>         vfs.zfs.nopwrite_enabled                1
> >>         vfs.zfs.dedup.prefetch                  1
> >>         vfs.zfs.dbuf_cache_lowater_pct          10
> >>         vfs.zfs.dbuf_cache_hiwater_pct          10
> >>         vfs.zfs.dbuf_metadata_cache_overflow    0
> >>         vfs.zfs.dbuf_metadata_cache_shift       6
> >>         vfs.zfs.dbuf_cache_shift                5
> >>         vfs.zfs.dbuf_metadata_cache_max_bytes   1025282816
> >>         vfs.zfs.dbuf_cache_max_bytes            2050565632
> >>         vfs.zfs.arc_min_prescient_prefetch_ms   6
> >>         vfs.zfs.arc_min_prefetch_ms             1
> >>         vfs.zfs.l2c_only_size                   0
> >>         vfs.zfs.mfu_ghost_data_esize            7778263552
> >>         vfs.zfs.mfu_ghost_metadata_esize        16851792896
> >>         vfs.zfs.mfu_ghost_size                  24630056448
> >>         vfs.zfs.mfu_data_esize                  3059418112
> >>         vfs.zfs.mfu_metadata_esize              28641792
> >>         vfs.zfs.mfu_size                        6399023104
> >>         vfs.zfs.mru_ghost_data_esize            2199812096
> >>         vfs.zfs.mru_ghost_metadata_esize        6289682432
> >>         vfs.zfs.mru_ghost_size                  8489494528
> >>         vfs.zfs.mru_data_esize                  22781456384
> >>         vfs.zfs.mru_metadata_esize              309155840
> >>         vfs.zfs.mru_size                        23847875584
> >>         vfs.zfs.anon_data_esize                 0
> >>         vfs.zfs.anon_metadata_esize             0
> >>         vfs.zfs.anon_size                       8556544
> >>         vfs.zfs.l2arc_norw                      1
> >>         vfs.zfs.l2arc_feed_again                1
> >>         vfs.zfs.l2arc_noprefetch                1
> >>         vfs.zfs.l2arc_feed_min_ms               200
> >>         vfs.zfs.l2arc_feed_secs                 1
> >>         vfs.zfs.l2arc_headroom                  2
> >>         vfs.zfs.l2arc_write_boost               8388608
> >>         vfs.zfs.l2arc_write_max                 8388608
> >>         vfs.zfs.arc_meta_strategy               1
> >>         vfs.zfs.arc_meta_limit                  15833624576
> >>         vfs.zfs.arc_free_target                 346902
> >>         vfs.zfs.arc_kmem_cache_reap_retry_ms    1000
> >>         vfs.zfs.compressed_arc_enabled          1
> >>         vfs.zfs.arc_grow_retry                  60
> >>         vfs.zfs.arc_shrink_shift                7
> >>         vfs.zfs.arc_average_blocksize           8192
> >>         vfs.zfs.arc_no_grow_shift               5
> >>         vfs.zfs.arc_min                         8202262528
> >>         vfs.zfs.arc_max                         39334498304
> >>         vfs.zfs.abd_chunk_size                  4096
> >>         vfs.zfs.abd_scatter_enabled             1
> >>
> >
> 

