Disk/ZFS activity crash on 11.2-STABLE [SOLVED]

Jim Long list at museum.rain.com
Fri Jul 13 21:50:52 UTC 2018

On Fri, Jul 13, 2018 at 03:22:39PM -0400, Mike Tancsa wrote:
> If you ever have a system with a LOT of small files and directories, a
> handy value to tune / keep an eye on is the mix allocated to metadata vs
> file data. vfs.zfs.arc_meta_limit.  You can tell when doing things like
> "ls" in a directory takes a LONG time to list files. In my case, I had
> many directories with 50,000+ files.
> Also things like 'zfs list -t snapshot' start to take a long time.

I think I already have that symptom on another new server, a backup
retention server.  It's slow (CPU) and fat (disk).

# time zfs list -Hrt snap | wc -l

real    2m47.811s
user    0m1.757s
sys     0m20.828s

Almost three minutes to list all snapshots found.  So when that
symptomatic slowness appears, is the tweak to *raise* arc_meta_limit?
I don't immediately see how to tell what the current arc_meta usage is,
and thus how close it is to the limit.
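
One possible way to check this, offered as a rough sketch rather than a
confirmed answer: compare a "used" counter against vfs.zfs.arc_meta_limit.
The counter name kstat.zfs.misc.arcstats.arc_meta_used is an assumption
about this release and may differ, and the helper arc_meta_pct is a
hypothetical name made up for illustration.

```shell
#!/bin/sh
# Print metadata usage as a whole-number percentage of the limit.
# Arguments: used bytes, limit bytes. (Hypothetical helper, not a system tool.)
arc_meta_pct() {
    used=$1
    limit=$2
    if [ "$limit" -le 0 ]; then
        echo "limit must be positive" >&2
        return 1
    fi
    # Integer arithmetic only, so the result is truncated toward zero.
    echo $(( $used * 100 / $limit ))
}

# ASSUMPTION: current metadata usage is exported as
# kstat.zfs.misc.arcstats.arc_meta_used. If that name is absent,
# "sysctl -a | grep arc_meta" should show what the release provides.
if used=$(sysctl -n kstat.zfs.misc.arcstats.arc_meta_used 2>/dev/null) &&
   limit=$(sysctl -n vfs.zfs.arc_meta_limit 2>/dev/null); then
    echo "arc_meta: ${used} of ${limit} bytes ($(arc_meta_pct "$used" "$limit")%)"
else
    echo "ARC metadata sysctls not found on this system" >&2
fi
```

If the percentage sits near 100 while listings are slow, that would be
consistent with metadata being squeezed out of the ARC, though it does not
prove that raising the limit is the right fix.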

From that storage server ("electron"):

ARC Efficiency:                                 20.18m
        Cache Hit Ratio:                91.20%  18.40m
        Cache Miss Ratio:               8.80%   1.78m
        Actual Hit Ratio:               91.18%  18.40m

        Data Demand Efficiency:         87.10%  11.95k

          Anonymously Used:             0.02%   3.15k
          Most Recently Used:           0.30%   55.95k
          Most Frequently Used:         99.68%  18.34m
          Most Recently Used Ghost:     0.00%   0
          Most Frequently Used Ghost:   0.00%   0

          Demand Data:                  0.06%   10.41k
          Prefetch Data:                0.00%   0
          Demand Metadata:              99.93%  18.39m
          Prefetch Metadata:            0.02%   3.15k

          Demand Data:                  0.09%   1.54k
          Prefetch Data:                0.00%   0
          Demand Metadata:              99.73%  1.77m
          Prefetch Metadata:            0.19%   3.30k

# sysctl -a | grep arc | grep ^vfs.zfs
vfs.zfs.l2arc_norw: 1
vfs.zfs.l2arc_feed_again: 1
vfs.zfs.l2arc_noprefetch: 1
vfs.zfs.l2arc_feed_min_ms: 200
vfs.zfs.l2arc_feed_secs: 1
vfs.zfs.l2arc_headroom: 2
vfs.zfs.l2arc_write_boost: 8388608
vfs.zfs.l2arc_write_max: 8388608
vfs.zfs.arc_meta_limit: 16432737280
vfs.zfs.arc_free_target: 113124
vfs.zfs.compressed_arc_enabled: 1
vfs.zfs.arc_grow_retry: 60
vfs.zfs.arc_shrink_shift: 7
vfs.zfs.arc_average_blocksize: 8192
vfs.zfs.arc_no_grow_shift: 5
vfs.zfs.arc_min: 8216368640
vfs.zfs.arc_max: 65730949120

# top | head -8
last pid:   943;  load averages:  0.14,  0.15,  0.10  up 0+00:30:43    14:42:26
22 processes:  1 running, 21 sleeping

Mem: 16M Active, 13M Inact, 1063M Wired, 61G Free
ARC: 374M Total, 214M MFU, 63M MRU, 32K Anon, 5614K Header, 91M Other
     79M Compressed, 223M Uncompressed, 2.81:1 Ratio
Swap: 8192M Total, 8192M Free

Thanks again, Mike.


