ZFS and mem management
Pavlo
devgs at ukr.net
Wed Feb 15 12:39:45 UTC 2012
Unfortunately we can't afford to disable prefetch; the overhead is too
high.
I also ran some tests. I have a process that maps a file with mmap() and
then reads or writes the first byte of each page of the mapped file.
The machine has 8 GB of RAM.
Test 1:
The tool maps a 1.5 GB file and writes some random data into the first
byte of each page. Wired memory fills up (with file cache?); the virtual
address size of the process is 1.5 GB while RES is ~20 MB. Without
closing the tool, I ask it to write to each page again: now Active
memory fills up while Wired stays the same size. Then I ask my next
tool, a 'memory eater', to allocate 6 GB of memory. It gets 5.8 GB,
hangs in a page fault, sleeps there for about 10 seconds and gets
killed (out of swap space). After the 'memory eater' is killed I still
see 900 MB in Active, which matches the (now reduced) RES size of the
first tool.
I suppose those 900 MB are pages that there was no time to flush back
to the file and free before the 'memory eater' was killed, although the
system did have time to squeeze 600 MB of RAM out of the first tool.
More often, though, I see the full 1.5 GB still in Active afterwards.
That means that even though we have 1.5 GB of memory that could easily
be flushed back to the file, this doesn't happen (on Linux, for example,
it always does); the 'memory eater' just hangs in a page fault and later
gets killed. Sometimes this happens even after the first tool has
finished its job and unmapped the file: the 'frozen' 1.5 GB of Active
memory is still there. Note that it is actually reusable: if I run the
first tool again, the pages get reclaimed.
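
In case it helps, the first tool boils down to roughly the following (a
minimal sketch, not the exact source; it assumes the file already exists
and is given on the command line):

/*
 * Sketch of the test tool: mmap() an existing file and touch the first
 * byte of every page, either reading it or overwriting it with random
 * data.
 */
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	struct stat st;
	long pagesize = sysconf(_SC_PAGESIZE);
	volatile char *p;
	off_t off;
	int do_write, fd;

	if (argc < 3) {
		fprintf(stderr, "usage: %s file read|write\n", argv[0]);
		return (1);
	}
	do_write = (argv[2][0] == 'w');

	fd = open(argv[1], O_RDWR);
	if (fd == -1 || fstat(fd, &st) == -1) {
		perror(argv[1]);
		return (1);
	}

	p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
	    MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return (1);
	}

	/* Touch the first byte of each page of the mapping. */
	for (off = 0; off < st.st_size; off += pagesize) {
		if (do_write)
			p[off] = (char)random();  /* dirties the page */
		else
			(void)p[off];             /* only faults it in */
	}

	/* Keep the mapping alive so RES/Active can be watched in top(1). */
	pause();
	return (0);
}
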
Test 2:
To rule out the filesystem being busy with writes, I use only read
operations on the mmap()ed region.
Case 1: the tool makes 2 passes over the mmaped memory, reading the
first byte of each page.
After the second pass the RES size is almost equal to the virtual
address size, i.e. almost every page is resident in RAM. I run the
'memory eater' and ask for 6 GB again. After a short hang in a page
fault it gets what I asked for, while the first tool's RES size is
dramatically reduced. That's what I wanted.
Case 2: the tool makes 10+ passes over the mmaped memory, reading the
first byte of each page.
The first time I run the 'memory eater', sometimes it gets killed as in
test 1 and sometimes some pages are given up to it.
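
The 'memory eater' is essentially just this (again a minimal sketch, not
the exact source):

/*
 * Sketch of the 'memory eater': allocate the requested number of
 * gigabytes anonymously and touch every page so the allocation must be
 * backed by real memory, forcing the pager to reclaim from elsewhere.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	long pagesize = sysconf(_SC_PAGESIZE);
	size_t gb, len, off;
	char *p;

	gb = (argc > 1) ? (size_t)atoi(argv[1]) : 6;
	len = gb << 30;

	p = malloc(len);
	if (p == NULL) {
		perror("malloc");
		return (1);
	}

	/*
	 * Touch each page.  This is where the process hangs in a page
	 * fault and eventually gets "killed: out of swap space" when the
	 * system cannot reclaim pages fast enough.
	 */
	for (off = 0; off < len; off += pagesize) {
		p[off] = 1;
		if ((off & ((1UL << 30) - 1)) == 0)
			printf("%zu GB touched\n", off >> 30);
	}

	printf("got all %zu GB\n", gb);
	pause();
	return (0);
}
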
I can't figure out where to dig. When RAM contains pages that are only
being read, freeing them should not be a problem, yet sometimes it
doesn't happen. Again, even though Linux differs a lot from FreeBSD, it
always does the 'right' thing: it flushes the pages and provides the
memory. Well, at least I believe that is the right thing.
Thanks.
>
2012/2/15 Pavlo <devgs at ukr.net>:
>
> Hey George,
>
> thanks for quick response.
>
> No, no dedup is used.
>
> zfs-stats -a :
>
> ------------------------------------------------------------------------
> ZFS Subsystem Report Wed Feb 15 12:26:18 2012
>
> ------------------------------------------------------------------------
>
> System Information:
>
> Kernel Version: 802516 (osreldate)
> Hardware Platform: amd64
> Processor Architecture: amd64
>
> ZFS Storage pool Version: 28
> ZFS Filesystem Version: 5
>
> FreeBSD 8.2-STABLE #12: Thu Feb 9 11:35:23 EET 2012 root
> 12:26PM up 2:29, 7 users, load averages: 0.02, 0.16, 0.16
>
> ------------------------------------------------------------------------
>
> System Memory:
>
> 19.78% 1.53 GiB Active, 0.95% 75.21 MiB Inact
> 36.64% 2.84 GiB Wired, 0.06% 4.83 MiB Cache
> 42.56% 3.30 GiB Free, 0.01% 696.00 KiB Gap
>
>
> Real Installed: 8.00 GiB
> Real Available: 99.84% 7.99 GiB
> Real Managed: 96.96% 7.74 GiB
>
> Logical Total: 8.00 GiB
> Logical Used: 57.82% 4.63 GiB
> Logical Free: 42.18% 3.37 GiB
>
> Kernel Memory: 2.43 GiB
> Data: 99.54% 2.42 GiB
> Text: 0.46% 11.50 MiB
>
> Kernel Memory Map: 3.16 GiB
> Size: 69.69% 2.20 GiB
> Free: 30.31% 979.48 MiB
>
> ------------------------------------------------------------------------
>
> ARC Summary: (THROTTLED)
> Memory Throttle Count: 3.82k
>
> ARC Misc:
> Deleted: 874.34k
> Recycle Misses: 376.12k
> Mutex Misses: 4.74k
> Evict Skips: 4.74k
>
> ARC Size: 68.53% 2.34 GiB
> Target Size: (Adaptive) 68.54% 2.34 GiB
> Min Size (Hard Limit): 12.50% 437.50 MiB
> Max Size (High Water): 8:1 3.42 GiB
>
> ARC Size Breakdown:
> Recently Used Cache Size: 92.95% 2.18 GiB
> Frequently Used Cache Size: 7.05% 169.01 MiB
>
> ARC Hash Breakdown:
> Elements Max: 229.96k
> Elements Current: 40.05% 92.10k
> Collisions: 705.52k
> Chain Max: 11
> Chains: 20.64k
>
> ------------------------------------------------------------------------
>
> ARC Efficiency: 7.96m
> Cache Hit Ratio: 84.92% 6.76m
> Cache Miss Ratio: 15.08% 1.20m
> Actual Hit Ratio: 76.29% 6.08m
>
> Data Demand Efficiency: 91.32% 4.99m
> Data Prefetch Efficiency: 19.57% 134.19k
>
> CACHE HITS BY CACHE LIST:
> Anonymously Used: 7.24% 489.41k
> Most Recently Used: 25.29% 1.71m
> Most Frequently Used: 64.54% 4.37m
> Most Recently Used Ghost: 1.42% 95.77k
> Most Frequently Used Ghost: 1.51% 102.33k
>
> CACHE HITS BY DATA TYPE:
> Demand Data: 67.42% 4.56m
> Prefetch Data: 0.39% 26.26k
> Demand Metadata: 22.41% 1.52m
> Prefetch Metadata: 9.78% 661.25k
>
> CACHE MISSES BY DATA TYPE:
> Demand Data: 36.11% 433.60k
> Prefetch Data: 8.99% 107.94k
> Demand Metadata: 32.00% 384.29k
> Prefetch Metadata: 22.91% 275.09k
>
> ------------------------------------------------------------------------
>
> L2ARC is disabled
>
> ------------------------------------------------------------------------
>
> File-Level Prefetch: (HEALTHY)
>
> DMU Efficiency: 26.49m
> Hit Ratio: 71.64% 18.98m
> Miss Ratio: 28.36% 7.51m
>
> Colinear: 7.51m
> Hit Ratio: 0.02% 1.42k
> Miss Ratio: 99.98% 7.51m
>
> Stride: 18.85m
> Hit Ratio: 99.97% 18.85m
> Miss Ratio: 0.03% 5.73k
>
> DMU Misc:
> Reclaim: 7.51m
> Successes: 0.29% 21.58k
> Failures: 99.71% 7.49m
>
> Streams: 130.46k
> +Resets: 0.35% 461
> -Resets: 99.65% 130.00k
> Bogus: 0
>
> ------------------------------------------------------------------------
>
> VDEV cache is disabled
>
> ------------------------------------------------------------------------
>
> ZFS Tunables (sysctl):
> kern.maxusers 384
> vm.kmem_size 4718592000
> vm.kmem_size_scale 1
> vm.kmem_size_min 0
> vm.kmem_size_max 329853485875
> vfs.zfs.l2c_only_size 0
> vfs.zfs.mfu_ghost_data_lsize 2705408
> vfs.zfs.mfu_ghost_metadata_lsize 332861440
> vfs.zfs.mfu_ghost_size 335566848
> vfs.zfs.mfu_data_lsize 1641984
> vfs.zfs.mfu_metadata_lsize 3048448
> vfs.zfs.mfu_size 28561920
> vfs.zfs.mru_ghost_data_lsize 68477440
> vfs.zfs.mru_ghost_metadata_lsize 62875648
> vfs.zfs.mru_ghost_size 131353088
> vfs.zfs.mru_data_lsize 1651216384
> vfs.zfs.mru_metadata_lsize 278577152
> vfs.zfs.mru_size 2306510848
> vfs.zfs.anon_data_lsize 0
> vfs.zfs.anon_metadata_lsize 0
> vfs.zfs.anon_size 12968960
> vfs.zfs.l2arc_norw 1
> vfs.zfs.l2arc_feed_again 1
> vfs.zfs.l2arc_noprefetch 1
> vfs.zfs.l2arc_feed_min_ms 200
> vfs.zfs.l2arc_feed_secs 1
> vfs.zfs.l2arc_headroom 2
> vfs.zfs.l2arc_write_boost 8388608
> vfs.zfs.l2arc_write_max 8388608
> vfs.zfs.arc_meta_limit 917504000
> vfs.zfs.arc_meta_used 851157616
> vfs.zfs.arc_min 458752000
> vfs.zfs.arc_max 3670016000
> vfs.zfs.dedup.prefetch 1
> vfs.zfs.mdcomp_disable 0
> vfs.zfs.write_limit_override 1048576000
> vfs.zfs.write_limit_inflated 25728073728
> vfs.zfs.write_limit_max 1072003072
> vfs.zfs.write_limit_min 33554432
> vfs.zfs.write_limit_shift 3
> vfs.zfs.no_write_throttle 0
> vfs.zfs.zfetch.array_rd_sz 1048576
> vfs.zfs.zfetch.block_cap 256
> vfs.zfs.zfetch.min_sec_reap 2
> vfs.zfs.zfetch.max_streams 8
> vfs.zfs.prefetch_disable 0
> vfs.zfs.mg_alloc_failures 8
> vfs.zfs.check_hostid 1
> vfs.zfs.recover 0
> vfs.zfs.txg.synctime_ms 1000
> vfs.zfs.txg.timeout 10
> vfs.zfs.scrub_limit 10
> vfs.zfs.vdev.cache.bshift 16
> vfs.zfs.vdev.cache.size 0
> vfs.zfs.vdev.cache.max 16384
> vfs.zfs.vdev.write_gap_limit 4096
> vfs.zfs.vdev.read_gap_limit 32768
> vfs.zfs.vdev.aggregation_limit 131072
> vfs.zfs.vdev.ramp_rate 2
> vfs.zfs.vdev.time_shift 6
> vfs.zfs.vdev.min_pending 4
> vfs.zfs.vdev.max_pending 10
> vfs.zfs.vdev.bio_flush_disable 0
> vfs.zfs.cache_flush_disable 0
> vfs.zfs.zil_replay_disable 0
> vfs.zfs.zio.use_uma 0
> vfs.zfs.version.zpl 5
> vfs.zfs.version.spa 28
> vfs.zfs.version.acl 1
> vfs.zfs.debug 0
> vfs.zfs.super_owner 0
>
> ------------------------------------------------------------------------
I see that you are limiting your arc.max to 3G but you have prefetch enabled.
You can try disabling it:
vfs.zfs.prefetch_disable=1
If things turn out better you can increase your arc.max to 4G
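
Assuming the usual FreeBSD setup, both tunables would go into
/boot/loader.conf and take effect at the next boot, e.g.:

vfs.zfs.prefetch_disable="1"
# 4 GB ARC cap
vfs.zfs.arc_max="4294967296"
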
Regards
--
George Kontostanos
Aicom telecoms ltd - http://www.aisecure.net