advice needed: zpool of 10 x (raidz2 on (4+2) x 2T HDD)
Zeus Panchenko
zeus at ibs.dn.ua
Wed Dec 2 11:38:56 UTC 2015
greetings,
we deployed this storage and, as it has been filling up until now, I
see I need advice regarding its configuration and possible optimizations ...
the main reason I decided to ask for advice is this:
about once a month (or even more frequently, depending on the load, I
suspect) the host hangs and only a power reset helps; there is nothing
helpful in the log files either, just the fact of the restart and the
usual ctld activity.
after reboot, `zpool import' takes 40 minutes or more, and during this
time no host resource is used much: neither CPU nor memory, top and
systat show no load. (I have to export the pool first, since the geli
providers must be attached before import; if I attach geli while the
pool is still imported, I end up with a lot of "absent/damaged" disks
in the pool, which disappear after an export/import.)
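the attach-before-import ordering I follow can be scripted; a minimal
sketch of it (device names, key paths, and the skip logic are my own
illustration, not the actual setup):

```shell
#!/bin/sh
# Hypothetical bring-up order after a reboot: attach every geli
# provider first, and only then import the pool, so ZFS never sees
# the raw (still-encrypted) disks. Paths are examples only.
for dev in /dev/da*; do
    case "$dev" in
        *.eli) continue ;;   # skip already-attached providers
    esac
    geli attach -k "/root/keys/$(basename "$dev").key" "$dev"
done
zpool import storage
```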
so, I'm wondering: what can I do to trace the cause of the hangs? what
should I monitor to see them coming, and how can I prevent them?
so, please, advise
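for tracing the hangs, one approach I am considering (settings and
paths here are illustrative, assuming a stock 10.x box with a usable
dump device):

```shell
# Sketch of hang diagnostics on FreeBSD 10.x; values are examples.
# 1) Arrange for kernel crash dumps, so a forced panic during the
#    next hang can be analyzed post-mortem with kgdb:
#      /etc/rc.conf:  dumpdev="AUTO"  savecore_enable="YES"
# 2) While the box is still alive, periodically record kernel stacks;
#    threads stuck in zio_wait/txg_wait_* usually point at the cause:
procstat -kk -a > "/var/tmp/procstat.$(date +%s).txt"
# 3) Watch per-disk latency -- a single dying disk (or a stalled geli
#    thread) can hang the whole pool while CPU and memory stay idle:
gstat -f '^da[0-9]+$'
```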
----------------------------------------------------------------------------------
the details are below:
----------------------------------------------------------------------------------
the box is Supermicro X9DRD-7LN4F with:
CPU: Intel(R) Xeon(R) CPU E5-2630L (2 package(s) x 6 core(s) x 2 SMT threads)
RAM: 128 GB
STOR: 3 x LSI SAS2308 PCI-Express Fusion-MPT SAS-2 (jbod)
60 x HDD 2T (ATA WDC WD20EFRX-68A 0A80, Fixed Direct Access SCSI-6 device 600.000MB/s)
OS: FreeBSD 10.1-RELEASE #0 r274401 amd64
to avoid an OS memory shortage, sysctl vfs.zfs.arc_max is set to 120275861504
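(if memory serves, on FreeBSD 10 vfs.zfs.arc_max is a boot-time
tunable, so it belongs in /boot/loader.conf rather than sysctl.conf;
the value below is the one from this setup:)

```
# /boot/loader.conf
vfs.zfs.arc_max="120275861504"
```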
storage is provided to clients via iSCSI by ctld (each target is file-backed)
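for reference, a file-backed ctld target of this kind is configured
along these lines (the IQN and backing path are made up for
illustration, not taken from the real config):

```
# Hypothetical /etc/ctl.conf fragment
target iqn.2013-10.ua.dn.ibs:target0 {
        portal-group default
        lun 0 {
                path /storage/targets/target0.img
                blocksize 4096
        }
}
```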
the zpool consists of 10 x raidz2 vdevs, each raidz2 made of 6 geli
devices, and now looks like this (yes, deduplication is on):
> zpool list storage
NAME SIZE ALLOC FREE FRAG EXPANDSZ CAP DEDUP HEALTH ALTROOT
storage 109T 33.5T 75.2T - - 30% 1.57x ONLINE -
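given that dedup is on with 33.5T allocated, one thing worth checking
is whether the dedup table still fits in RAM: `zdb -DD storage`
reports the number of DDT entries, and each in-core entry is commonly
approximated at around 320 bytes (a rule of thumb, not an exact
figure). A back-of-the-envelope sketch with an invented entry count:

```shell
# Rough DDT memory estimate; the entry count is invented for
# illustration -- substitute the total reported by `zdb -DD storage`.
entries=150000000          # e.g. 150M unique blocks (hypothetical)
bytes_per_entry=320        # oft-quoted approximate in-core cost
ddt_bytes=$((entries * bytes_per_entry))
echo "DDT would need ~$((ddt_bytes / 1024 / 1024 / 1024)) GiB of RAM"
# -> DDT would need ~44 GiB of RAM
```

if the DDT overflows ARC metadata, every write pays for extra reads,
which would fit both the long imports and the idle-looking stalls.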
> zpool history storage
2013-10-21.01:31:14 zpool create storage
raidz2 gpt/c0s00 gpt/c0s01 gpt/c1s00 gpt/c1s01 gpt/c2s00 gpt/c2s01
raidz2 gpt/c0s02 gpt/c0s03 gpt/c1s02 gpt/c1s03 gpt/c2s02 gpt/c2s03
...
raidz2 gpt/c0s18 gpt/c0s19 gpt/c1s18 gpt/c1s19 gpt/c2s18 gpt/c2s19
log mirror gpt/log0 gpt/log1
cache gpt/cache0 gpt/cache1
> zdb storage
Cached configuration:
version: 5000
name: 'storage'
state: 0
txg: 13340514
pool_guid: 11994995707440773547
hostid: 1519855013
hostname: 'storage.foo.bar'
vdev_children: 11
vdev_tree:
type: 'root'
id: 0
guid: 11994995707440773547
children[0]:
type: 'raidz'
id: 0
guid: 12290021428260525074
nparity: 2
metaslab_array: 46
metaslab_shift: 36
ashift: 12
asize: 12002364751872
is_log: 0
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 3897093815971447961
path: '/dev/gpt/c0s00'
phys_path: '/dev/gpt/c0s00'
whole_disk: 1
DTL: 9133
create_txg: 4
children[1]:
type: 'disk'
id: 1
guid: 1036685341766239763
path: '/dev/gpt/c0s01'
phys_path: '/dev/gpt/c0s01'
whole_disk: 1
DTL: 9132
create_txg: 4
...
each geli provider is created on a single HDD:
> geli list da50.eli
Geom name: da50.eli
State: ACTIVE
EncryptionAlgorithm: AES-XTS
KeyLength: 256
Crypto: hardware
Version: 6
UsedKey: 0
Flags: (null)
KeysAllocated: 466
KeysTotal: 466
Providers:
1. Name: da50.eli
Mediasize: 2000398929920 (1.8T)
Sectorsize: 4096
Mode: r1w1e3
Consumers:
1. Name: da50
Mediasize: 2000398934016 (1.8T)
Sectorsize: 512
Stripesize: 4096
Stripeoffset: 0
Mode: r1w1e1
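a provider with the parameters shown above (AES-XTS, 256-bit key, 4K
sectors) would have been created roughly like this; the key file path
is hypothetical:

```shell
# Hypothetical creation matching the listing above: AES-XTS,
# 256-bit key, 4 KiB sectors on one whole 2T disk.
geli init -e AES-XTS -l 256 -s 4096 -K /root/keys/da50.key /dev/da50
geli attach -k /root/keys/da50.key /dev/da50
```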
each raidz2 member disk is partitioned as:
> gpart show da50.eli
=> 6 488378634 da50.eli GPT (1.8T)
6 488378634 1 freebsd-zfs (1.8T)
> zfs-stats -a
--------------------------------------------------------------------------
ZFS Subsystem Report Wed Dec 2 09:59:27 2015
--------------------------------------------------------------------------
System Information:
Kernel Version: 1001000 (osreldate)
Hardware Platform: amd64
Processor Architecture: amd64
FreeBSD 10.1-RELEASE #0 r274401: Tue Nov 11 21:02:49 UTC 2014 root
9:59AM up 1 day, 46 mins, 10 users, load averages: 1.03, 0.46, 0.75
--------------------------------------------------------------------------
System Memory Statistics:
Physical Memory: 131012.88M
Kernel Memory: 1915.37M
DATA: 98.62% 1888.90M
TEXT: 1.38% 26.47M
--------------------------------------------------------------------------
ZFS pool information:
Storage pool Version (spa): 5000
Filesystem Version (zpl): 5
--------------------------------------------------------------------------
ARC Misc:
Deleted: 1961248
Recycle Misses: 127014
Mutex Misses: 5973
Evict Skips: 5973
ARC Size:
Current Size (arcsize): 100.00% 114703.88M
Target Size (Adaptive, c): 100.00% 114704.00M
Min Size (Hard Limit, c_min): 12.50% 14338.00M
Max Size (High Water, c_max): ~8:1 114704.00M
ARC Size Breakdown:
Recently Used Cache Size (p): 93.75% 107535.69M
Freq. Used Cache Size (c-p): 6.25% 7168.31M
ARC Hash Breakdown:
Elements Max: 6746532
Elements Current: 100.00% 6746313
Collisions: 9651654
Chain Max: 0
Chains: 1050203
ARC Eviction Statistics:
Evicts Total: 194298918912
Evicts Eligible for L2: 81.00% 157373345280
Evicts Ineligible for L2: 19.00% 36925573632
Evicts Cached to L2: 97939090944
ARC Efficiency
Cache Access Total: 109810376
Cache Hit Ratio: 91.57% 100555148
Cache Miss Ratio: 8.43% 9255228
Actual Hit Ratio: 90.54% 99423922
Data Demand Efficiency: 76.64%
Data Prefetch Efficiency: 48.46%
CACHE HITS BY CACHE LIST:
Anonymously Used: 0.88% 881966
Most Recently Used (mru): 23.11% 23236902
Most Frequently Used (mfu): 75.77% 76187020
MRU Ghost (mru_ghost): 0.03% 26449
MFU Ghost (mfu_ghost): 0.22% 222811
CACHE HITS BY DATA TYPE:
Demand Data: 10.17% 10227867
Prefetch Data: 0.45% 455126
Demand Metadata: 88.69% 89184329
Prefetch Metadata: 0.68% 687826
CACHE MISSES BY DATA TYPE:
Demand Data: 33.69% 3117808
Prefetch Data: 5.23% 484140
Demand Metadata: 56.55% 5233984
Prefetch Metadata: 4.53% 419296
--------------------------------------------------------------------------
L2 ARC Summary:
Low Memory Aborts: 77
R/W Clashes: 13
Free on Write: 523
L2 ARC Size:
Current Size: (Adaptive) 91988.13M
Header Size: 0.13% 120.08M
L2 ARC Read/Write Activity:
Bytes Written: 97783.99M
Bytes Read: 2464.81M
L2 ARC Breakdown:
Access Total: 8110124
Hit Ratio: 2.89% 234616
Miss Ratio: 97.11% 7875508
Feeds: 85129
WRITES:
Sent Total: 100.00% 18448
--------------------------------------------------------------------------
VDEV Cache Summary:
Access Total: 0
Hits Ratio: 0.00% 0
Miss Ratio: 0.00% 0
Delegations: 0
--------------------------------------------------------------------------
File-Level Prefetch Stats (DMU):
DMU Efficiency:
Access Total: 162279162
Hit Ratio: 91.69% 148788486
Miss Ratio: 8.31% 13490676
Colinear Access Total: 13490676
Colinear Hit Ratio: 0.06% 8166
Colinear Miss Ratio: 99.94% 13482510
Stride Access Total: 146863482
Stride Hit Ratio: 99.31% 145846806
Stride Miss Ratio: 0.69% 1016676
DMU misc:
Reclaim successes: 124372
Reclaim failures: 13358138
Stream resets: 618
Stream noresets: 2938602
Bogus streams: 0
--------------------------------------------------------------------------
ZFS Tunable (sysctl):
kern.maxusers=8524
vfs.zfs.arc_max=120275861504
vfs.zfs.arc_min=15034482688
vfs.zfs.arc_average_blocksize=8192
vfs.zfs.arc_meta_used=24838283936
vfs.zfs.arc_meta_limit=30068965376
vfs.zfs.l2arc_write_max=8388608
vfs.zfs.l2arc_write_boost=8388608
vfs.zfs.l2arc_headroom=2
vfs.zfs.l2arc_feed_secs=1
vfs.zfs.l2arc_feed_min_ms=200
vfs.zfs.l2arc_noprefetch=1
vfs.zfs.l2arc_feed_again=1
vfs.zfs.l2arc_norw=1
vfs.zfs.anon_size=27974656
vfs.zfs.anon_metadata_lsize=0
vfs.zfs.anon_data_lsize=0
vfs.zfs.mru_size=112732930560
vfs.zfs.mru_metadata_lsize=18147921408
vfs.zfs.mru_data_lsize=92690379776
vfs.zfs.mru_ghost_size=7542758400
vfs.zfs.mru_ghost_metadata_lsize=1262705664
vfs.zfs.mru_ghost_data_lsize=6280052736
vfs.zfs.mfu_size=3748620800
vfs.zfs.mfu_metadata_lsize=1014886912
vfs.zfs.mfu_data_lsize=2723481600
vfs.zfs.mfu_ghost_size=24582345728
vfs.zfs.mfu_ghost_metadata_lsize=682512384
vfs.zfs.mfu_ghost_data_lsize=23899833344
vfs.zfs.l2c_only_size=66548531200
vfs.zfs.dedup.prefetch=1
vfs.zfs.nopwrite_enabled=1
vfs.zfs.mdcomp_disable=0
vfs.zfs.dirty_data_max=4294967296
vfs.zfs.dirty_data_max_max=4294967296
vfs.zfs.dirty_data_max_percent=10
vfs.zfs.dirty_data_sync=67108864
vfs.zfs.delay_min_dirty_percent=60
vfs.zfs.delay_scale=500000
vfs.zfs.prefetch_disable=0
vfs.zfs.zfetch.max_streams=8
vfs.zfs.zfetch.min_sec_reap=2
vfs.zfs.zfetch.block_cap=256
vfs.zfs.zfetch.array_rd_sz=1048576
vfs.zfs.top_maxinflight=32
vfs.zfs.resilver_delay=2
vfs.zfs.scrub_delay=4
vfs.zfs.scan_idle=50
vfs.zfs.scan_min_time_ms=1000
vfs.zfs.free_min_time_ms=1000
vfs.zfs.resilver_min_time_ms=3000
vfs.zfs.no_scrub_io=0
vfs.zfs.no_scrub_prefetch=0
vfs.zfs.metaslab.gang_bang=131073
vfs.zfs.metaslab.fragmentation_threshold=70
vfs.zfs.metaslab.debug_load=0
vfs.zfs.metaslab.debug_unload=0
vfs.zfs.metaslab.df_alloc_threshold=131072
vfs.zfs.metaslab.df_free_pct=4
vfs.zfs.metaslab.min_alloc_size=10485760
vfs.zfs.metaslab.load_pct=50
vfs.zfs.metaslab.unload_delay=8
vfs.zfs.metaslab.preload_limit=3
vfs.zfs.metaslab.preload_enabled=1
vfs.zfs.metaslab.fragmentation_factor_enabled=1
vfs.zfs.metaslab.lba_weighting_enabled=1
vfs.zfs.metaslab.bias_enabled=1
vfs.zfs.condense_pct=200
vfs.zfs.mg_noalloc_threshold=0
vfs.zfs.mg_fragmentation_threshold=85
vfs.zfs.check_hostid=1
vfs.zfs.spa_load_verify_maxinflight=10000
vfs.zfs.spa_load_verify_metadata=1
vfs.zfs.spa_load_verify_data=1
vfs.zfs.recover=0
vfs.zfs.deadman_synctime_ms=1000000
vfs.zfs.deadman_checktime_ms=5000
vfs.zfs.deadman_enabled=1
vfs.zfs.spa_asize_inflation=24
vfs.zfs.txg.timeout=5
vfs.zfs.vdev.cache.max=16384
vfs.zfs.vdev.cache.size=0
vfs.zfs.vdev.cache.bshift=16
vfs.zfs.vdev.trim_on_init=1
vfs.zfs.vdev.mirror.rotating_inc=0
vfs.zfs.vdev.mirror.rotating_seek_inc=5
vfs.zfs.vdev.mirror.rotating_seek_offset=1048576
vfs.zfs.vdev.mirror.non_rotating_inc=0
vfs.zfs.vdev.mirror.non_rotating_seek_inc=1
vfs.zfs.vdev.max_active=1000
vfs.zfs.vdev.sync_read_min_active=10
vfs.zfs.vdev.sync_read_max_active=10
vfs.zfs.vdev.sync_write_min_active=10
vfs.zfs.vdev.sync_write_max_active=10
vfs.zfs.vdev.async_read_min_active=1
vfs.zfs.vdev.async_read_max_active=3
vfs.zfs.vdev.async_write_min_active=1
vfs.zfs.vdev.async_write_max_active=10
vfs.zfs.vdev.scrub_min_active=1
vfs.zfs.vdev.scrub_max_active=2
vfs.zfs.vdev.trim_min_active=1
vfs.zfs.vdev.trim_max_active=64
vfs.zfs.vdev.aggregation_limit=131072
vfs.zfs.vdev.read_gap_limit=32768
vfs.zfs.vdev.write_gap_limit=4096
vfs.zfs.vdev.bio_flush_disable=0
vfs.zfs.vdev.bio_delete_disable=0
vfs.zfs.vdev.trim_max_bytes=2147483648
vfs.zfs.vdev.trim_max_pending=64
vfs.zfs.max_auto_ashift=13
vfs.zfs.min_auto_ashift=9
vfs.zfs.zil_replay_disable=0
vfs.zfs.cache_flush_disable=0
vfs.zfs.zio.use_uma=1
vfs.zfs.zio.exclude_metadata=0
vfs.zfs.sync_pass_deferred_free=2
vfs.zfs.sync_pass_dont_compress=5
vfs.zfs.sync_pass_rewrite=2
vfs.zfs.snapshot_list_prefetch=0
vfs.zfs.super_owner=0
vfs.zfs.debug=0
vfs.zfs.version.ioctl=4
vfs.zfs.version.acl=1
vfs.zfs.version.spa=5000
vfs.zfs.version.zpl=5
vfs.zfs.vol.mode=1
vfs.zfs.trim.enabled=1
vfs.zfs.trim.txg_delay=32
vfs.zfs.trim.timeout=30
vfs.zfs.trim.max_interval=1
vm.kmem_size=133823901696
vm.kmem_size_scale=1
vm.kmem_size_min=0
vm.kmem_size_max=1319413950874
--
Zeus V. Panchenko jid:zeus at im.ibs.dn.ua
IT Dpt., I.B.S. LLC GMT+2 (EET)