ZFS stats in "top" -- ZFS performance started being crappy in spurts
Chad Leigh - Pengar LLC
chad at pengar.com
Sat Aug 11 23:33:29 UTC 2012
Hi
I have a FreeBSD 9 system with ZFS root. It is actually a VM under Xen on a beefy piece of HW (4-core Sandy Bridge 3 GHz Xeon, 32 GB total HW memory -- the VM has 4 vcpus and 6 GB RAM), with the pool mirrored across two gpart partitions. I am looking for data integrity more than performance, as long as performance is reasonable (which it has more than been for the last 3 months).
The other VMs on the same HW are set up the same way and don't have this problem. There are 4 other FreeBSD VMs: one runs email for a one-man company and a few of his friends, plus some static web pages and such for him; one runs a few low-use web apps for various customers; and one runs about 30 websites with apache and nginx, mostly static sites. None are heavily used. There is also one Linux VM running a couple of low-use FrontBase databases -- low-use ones, not high-use.
The troublesome VM had been running fine for over 3 months since I installed it, with a pretty much constant level of use. It runs 4 jails, each dedicated to a different bit of email processing for a small number of users: one is a secondary DNS, one runs clamav and spamassassin, one runs exim for incoming and outgoing mail, and one runs dovecot for imap and pop. There is no web server, database, or anything else running.
Total number of mail users on the system is approximately 50. Total mail traffic is very low compared to "real" mail servers.
Earlier this week things started "freezing up". Processes become unresponsive; an episode can last a few minutes or as long as half an hour or more. It eventually resolves itself and things are good for anywhere from 10 minutes to 3 hours until it happens again. When it happens, lots of processes are listed in "top" in states such as

    zfs
    zio->i
    tx->tx
    db->db

These processes only get listed in these states when there are problems. What are these states indicative of?
Eventually things get going again, these states drop off and the system hums along.
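In case it helps with diagnosis: those STATE strings look like ZFS condition-variable names truncated to top's column width -- my guess (an assumption, not confirmed) is "zio->i" for zio->io_cv (waiting on I/O completion), "tx->tx" for the tx->tx_*_cv transaction-group waits, and "db->db" for db->db_changed. procstat -t shows the full wait channel, so here is a little sketch I run during a freeze to tally stuck threads by wait channel (the helper itself is portable; only the procstat usage in the comment is FreeBSD-specific):

```shell
# Tally threads per ZFS wait channel from procstat(1)-style output.
# Assumes the WCHAN name is the last whitespace-separated field,
# as in `procstat -t -a` output on FreeBSD.
count_zfs_waits() {
    awk '$NF ~ /^(zio->|tx->|db->)/ { n[$NF]++ }
         END { for (w in n) printf "%6d %s\n", n[w], w }'
}

# On the affected box, while it is frozen:
#   procstat -t -a | count_zfs_waits
```

Running that a few times during an episode should show whether threads pile up on the txg waits (suggesting a stalled transaction-group sync) or on I/O completion.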
Based on something I found via Google (from a person who had a different but somewhat similar problem), I tried setting

    zfs set primarycache=metadata zroot

and

    zfs set primarycache=none zroot

to see whether the system was "churning" on cache upkeep, but the problem still happened with approximately the same severity and frequency.
What is strange is that this server ran fine for 3 months straight without interruption, under the same workload.
Thanks for any hints or clues
Chad
some data points below
---
# uname -a
FreeBSD newbagend 9.0-STABLE FreeBSD 9.0-STABLE #1: Wed Mar 21 15:22:14 MDT 2012 chad at underhill:/usr/obj/usr/src/sys/UNDERHILL-XEN amd64
#
---
# zpool status
pool: zroot
state: ONLINE
scan: scrub repaired 0 in 6h13m with 0 errors on Fri Aug 10 19:33:23 2012
config:
NAME STATE READ WRITE CKSUM
zroot ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
gptid/f0da8263-8a52-11e1-b3ae-aa00003efccd ONLINE 0 0 0
gptid/0f24ab58-8a53-11e1-b3ae-aa00003efccd ONLINE 0 0 0
errors: No known data errors
#
---
Representative output from zfs-stats during a trouble period:
zfs-stats -a
------------------------------------------------------------------------
ZFS Subsystem Report Sat Aug 11 13:40:07 2012
------------------------------------------------------------------------
System Information:
Kernel Version: 900505 (osreldate)
Hardware Platform: amd64
Processor Architecture: amd64
ZFS Storage pool Version: 28
ZFS Filesystem Version: 5
FreeBSD 9.0-STABLE #1: Wed Mar 21 15:22:14 MDT 2012 chad
1:40PM up 2:54, 3 users, load averages: 0.23, 0.19, 0.14
------------------------------------------------------------------------
System Memory:
11.49% 681.92 MiB Active, 4.03% 238.97 MiB Inact
33.37% 1.93 GiB Wired, 0.05% 3.04 MiB Cache
51.04% 2.96 GiB Free, 0.01% 808.00 KiB Gap
Real Installed: 6.00 GiB
Real Available: 99.65% 5.98 GiB
Real Managed: 96.93% 5.80 GiB
Logical Total: 6.00 GiB
Logical Used: 46.76% 2.81 GiB
Logical Free: 53.24% 3.19 GiB
Kernel Memory: 1.25 GiB
Data: 98.38% 1.23 GiB
Text: 1.62% 20.75 MiB
Kernel Memory Map: 5.68 GiB
Size: 17.27% 1003.75 MiB
Free: 82.73% 4.70 GiB
------------------------------------------------------------------------
ARC Summary: (HEALTHY)
Memory Throttle Count: 0
ARC Misc:
Deleted: 9
Recycle Misses: 64.30k
Mutex Misses: 10
Evict Skips: 58.80k
ARC Size: 39.98% 1.20 GiB
Target Size: (Adaptive) 100.00% 3.00 GiB
Min Size (Hard Limit): 12.50% 384.00 MiB
Max Size (High Water): 8:1 3.00 GiB
ARC Size Breakdown:
Recently Used Cache Size: 25.56% 785.15 MiB
Frequently Used Cache Size: 74.44% 2.23 GiB
ARC Hash Breakdown:
Elements Max: 223.30k
Elements Current: 99.93% 223.15k
Collisions: 418.23k
Chain Max: 9
Chains: 66.67k
------------------------------------------------------------------------
ARC Efficiency: 3.17m
Cache Hit Ratio: 89.07% 2.82m
Cache Miss Ratio: 10.93% 346.27k
Actual Hit Ratio: 86.49% 2.74m
Data Demand Efficiency: 99.50% 1.09m
Data Prefetch Efficiency: 60.54% 1.78k
CACHE HITS BY CACHE LIST:
Most Recently Used: 23.72% 669.34k
Most Frequently Used: 73.38% 2.07m
Most Recently Used Ghost: 1.92% 54.33k
Most Frequently Used Ghost: 3.30% 93.02k
CACHE HITS BY DATA TYPE:
Demand Data: 38.35% 1.08m
Prefetch Data: 0.04% 1.08k
Demand Metadata: 58.75% 1.66m
Prefetch Metadata: 2.87% 80.97k
CACHE MISSES BY DATA TYPE:
Demand Data: 1.56% 5.39k
Prefetch Data: 0.20% 704
Demand Metadata: 55.46% 192.02k
Prefetch Metadata: 42.78% 148.15k
------------------------------------------------------------------------
L2ARC is disabled
------------------------------------------------------------------------
File-Level Prefetch: (HEALTHY)
DMU Efficiency: 6.05m
Hit Ratio: 66.59% 4.03m
Miss Ratio: 33.41% 2.02m
Colinear: 2.02m
Hit Ratio: 0.04% 725
Miss Ratio: 99.96% 2.02m
Stride: 3.90m
Hit Ratio: 99.98% 3.90m
Miss Ratio: 0.02% 826
DMU Misc:
Reclaim: 2.02m
Successes: 2.02% 40.86k
Failures: 97.98% 1.98m
Streams: 125.81k
+Resets: 0.36% 453
-Resets: 99.64% 125.36k
Bogus: 0
------------------------------------------------------------------------
VDEV Cache Summary: 530.68k
Hit Ratio: 15.30% 81.21k
Miss Ratio: 70.40% 373.57k
Delegations: 14.30% 75.89k
------------------------------------------------------------------------
ZFS Tunables (sysctl):
kern.maxusers 512
vm.kmem_size 6222712832
vm.kmem_size_scale 1
vm.kmem_size_min 0
vm.kmem_size_max 329853485875
vfs.zfs.l2c_only_size 0
vfs.zfs.mfu_ghost_data_lsize 91367424
vfs.zfs.mfu_ghost_metadata_lsize 128350208
vfs.zfs.mfu_ghost_size 219717632
vfs.zfs.mfu_data_lsize 132299264
vfs.zfs.mfu_metadata_lsize 20034048
vfs.zfs.mfu_size 160949760
vfs.zfs.mru_ghost_data_lsize 45155328
vfs.zfs.mru_ghost_metadata_lsize 642998784
vfs.zfs.mru_ghost_size 688154112
vfs.zfs.mru_data_lsize 347115520
vfs.zfs.mru_metadata_lsize 10907136
vfs.zfs.mru_size 794174976
vfs.zfs.anon_data_lsize 0
vfs.zfs.anon_metadata_lsize 0
vfs.zfs.anon_size 29469696
vfs.zfs.l2arc_norw 1
vfs.zfs.l2arc_feed_again 1
vfs.zfs.l2arc_noprefetch 1
vfs.zfs.l2arc_feed_min_ms 200
vfs.zfs.l2arc_feed_secs 1
vfs.zfs.l2arc_headroom 2
vfs.zfs.l2arc_write_boost 8388608
vfs.zfs.l2arc_write_max 8388608
vfs.zfs.arc_meta_limit 805306368
vfs.zfs.arc_meta_used 805310296
vfs.zfs.arc_min 402653184
vfs.zfs.arc_max 3221225472
vfs.zfs.dedup.prefetch 1
vfs.zfs.mdcomp_disable 0
vfs.zfs.write_limit_override 0
vfs.zfs.write_limit_inflated 19260174336
vfs.zfs.write_limit_max 802507264
vfs.zfs.write_limit_min 33554432
vfs.zfs.write_limit_shift 3
vfs.zfs.no_write_throttle 0
vfs.zfs.zfetch.array_rd_sz 1048576
vfs.zfs.zfetch.block_cap 256
vfs.zfs.zfetch.min_sec_reap 2
vfs.zfs.zfetch.max_streams 8
vfs.zfs.prefetch_disable 0
vfs.zfs.mg_alloc_failures 8
vfs.zfs.check_hostid 1
vfs.zfs.recover 0
vfs.zfs.txg.synctime_ms 1000
vfs.zfs.txg.timeout 5
vfs.zfs.scrub_limit 10
vfs.zfs.vdev.cache.bshift 16
vfs.zfs.vdev.cache.size 10485760
vfs.zfs.vdev.cache.max 16384
vfs.zfs.vdev.write_gap_limit 4096
vfs.zfs.vdev.read_gap_limit 32768
vfs.zfs.vdev.aggregation_limit 131072
vfs.zfs.vdev.ramp_rate 2
vfs.zfs.vdev.time_shift 6
vfs.zfs.vdev.min_pending 4
vfs.zfs.vdev.max_pending 10
vfs.zfs.vdev.bio_flush_disable 0
vfs.zfs.cache_flush_disable 0
vfs.zfs.zil_replay_disable 0
vfs.zfs.zio.use_uma 0
vfs.zfs.snapshot_list_prefetch 0
vfs.zfs.version.zpl 5
vfs.zfs.version.spa 28
vfs.zfs.version.acl 1
vfs.zfs.debug 0
vfs.zfs.super_owner 0
------------------------
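One thing that stands out in the tunables above: vfs.zfs.arc_meta_used (805310296) is sitting right at vfs.zfs.arc_meta_limit (805306368), so the ARC may be constantly evicting metadata to stay under the limit. To check whether that correlates with the freezes I'd want to sample those two sysctls repeatedly during a stall. A quick sampler sketch (the helper is plain POSIX sh; the sysctl names on the last line are the ones from the dump above):

```shell
# Run a command N times, INTERVAL seconds apart (plain POSIX sh).
sample() {
    # usage: sample N INTERVAL cmd [args...]
    n=$1; ival=$2; shift 2
    while [ "$n" -gt 0 ]; do
        "$@"                               # run the command once
        n=$((n - 1))
        if [ "$n" -gt 0 ]; then sleep "$ival"; fi
    done
}

# On the server, during a freeze, something like:
#   sample 30 1 sysctl vfs.zfs.arc_meta_used vfs.zfs.arc_meta_limit
```

If arc_meta_used stays pinned at the limit while the system is stuck, raising vfs.zfs.arc_meta_limit would be the obvious experiment.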
Representative zpool iostat from during a trouble period -- as you can see, not much is going on (low load), and the iostat during a calm, good period looks about the same:
zpool iostat zroot 1
capacity operations bandwidth
pool alloc free read write read write
---------- ----- ----- ----- ----- ----- -----
zroot 107G 41.9G 7 261 23.8K 1.52M
zroot 107G 41.9G 10 140 7.42K 272K
zroot 107G 41.9G 8 176 14.4K 547K
zroot 107G 41.9G 0 59 0 188K
zroot 107G 41.9G 5 171 6.44K 1.73M
zroot 107G 41.9G 4 284 8.42K 1006K
zroot 107G 41.9G 5 118 2.97K 260K
zroot 107G 41.9G 25 194 27.7K 623K
zroot 107G 41.9G 0 132 0 764K
zroot 107G 41.9G 1 95 6.44K 1.16M
zroot 107G 41.9G 8 272 16.3K 829K
zroot 107G 41.9G 56 212 103K 213K
zroot 107G 41.9G 22 221 27.7K 204K
zroot 107G 41.9G 2 455 1.48K 509K
zroot 107G 41.9G 14 198 7.42K 132K
zroot 107G 41.9G 14 270 7.42K 306K
zroot 107G 41.9G 6 273 3.46K 670K
zroot 107G 41.9G 21 175 10.9K 570K
zroot 107G 41.9G 17 179 8.91K 591K
zroot 107G 41.9G 11 289 17.3K 902K
zroot 107G 41.9G 13 121 6.93K 230K
zroot 107G 41.9G 18 238 9.41K 734K
zroot 107G 41.9G 99 61 50.5K 188K
zroot 107G 41.9G 0 222 0 862K
zroot 107G 41.9G 11 149 13.4K 1.12M
zroot 107G 41.9G 15 319 10.9K 1.05M
zroot 107G 41.9G 0 127 0 392K
zroot 107G 41.9G 0 159 0 1.70M
zroot 107G 41.9G 68 196 212K 601K
zroot 107G 41.9G 17 144 18.8K 295K
zroot 107G 41.9G 12 187 17.3K 588K
zroot 107G 41.9G 0 136 0 1.23M
zroot 107G 41.9G 6 209 23.8K 564K
zroot 107G 41.9G 11 199 12.4K 422K
zroot 107G 41.9G 12 178 9.41K 553K
zroot 107G 41.9G 0 140 1.48K 1.17M
zroot 107G 41.9G 48 200 128K 411K
zroot 107G 41.9G 8 191 16.8K 121K
zroot 107G 41.9G 1 397 1013 375K
zroot 107G 41.9G 0 263 0 132K
zroot 107G 41.9G 14 228 13.4K 235K
zroot 107G 41.9G 7 21 4.46K 10.9K
zroot 107G 41.9G 2 161 1.48K 156K