ZFS and mem management
Peter Maloney
peter.maloney at brockmann-consult.de
Wed Feb 15 11:36:46 UTC 2012
Can you also post:
zpool get all <poolname>
And does your indexing scan through the .zfs/snapshot directory? If so,
this is a known issue that totally eats your memory, resulting in swap
space errors.
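
If your indexer walks the whole filesystem tree, it can wander into
.zfs/snapshot without noticing. If that turns out to be the case, a
guard along these lines keeps the walker out of it (only a sketch
using fts(3); adapt it to however your indexer actually traverses the
tree):

/* walk.c: walk a tree with fts(3) but never descend into a .zfs
 * directory (and therefore never into .zfs/snapshot) */
#include <sys/types.h>
#include <sys/stat.h>
#include <fts.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    char *paths[] = { argc > 1 ? argv[1] : ".", NULL };
    FTS *f = fts_open(paths, FTS_PHYSICAL | FTS_NOCHDIR, NULL);
    if (f == NULL) { perror("fts_open"); return 1; }

    FTSENT *e;
    while ((e = fts_read(f)) != NULL) {
        if (e->fts_info == FTS_D && strcmp(e->fts_name, ".zfs") == 0) {
            fts_set(f, e, FTS_SKIP);     /* skip the whole subtree */
            continue;
        }
        if (e->fts_info == FTS_F)
            puts(e->fts_path);           /* "index" the file here */
    }
    fts_close(f);
    return 0;
}
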
On 02/15/2012 11:28 AM, Pavlo wrote:
>
>
> Hey George,
>
> Thanks for the quick response.
>
> No, no dedup is used.
>
> zfs-stats -a :
>
> ------------------------------------------------------------------------
> ZFS Subsystem Report Wed Feb 15 12:26:18 2012
> ------------------------------------------------------------------------
>
> System Information:
>
> Kernel Version: 802516 (osreldate)
> Hardware Platform: amd64
> Processor Architecture: amd64
>
> ZFS Storage pool Version: 28
> ZFS Filesystem Version: 5
>
> FreeBSD 8.2-STABLE #12: Thu Feb 9 11:35:23 EET 2012 root
> 12:26PM up 2:29, 7 users, load averages: 0.02, 0.16, 0.16
>
> ------------------------------------------------------------------------
>
> System Memory:
>
> 19.78% 1.53 GiB Active, 0.95% 75.21 MiB Inact
> 36.64% 2.84 GiB Wired, 0.06% 4.83 MiB Cache
> 42.56% 3.30 GiB Free, 0.01% 696.00 KiB Gap
>
> Real Installed: 8.00 GiB
> Real Available: 99.84% 7.99 GiB
> Real Managed: 96.96% 7.74 GiB
>
> Logical Total: 8.00 GiB
> Logical Used: 57.82% 4.63 GiB
> Logical Free: 42.18% 3.37 GiB
>
> Kernel Memory: 2.43 GiB
> Data: 99.54% 2.42 GiB
> Text: 0.46% 11.50 MiB
>
> Kernel Memory Map: 3.16 GiB
> Size: 69.69% 2.20 GiB
> Free: 30.31% 979.48 MiB
>
> ------------------------------------------------------------------------
>
> ARC Summary: (THROTTLED)
> Memory Throttle Count: 3.82k
>
> ARC Misc:
> Deleted: 874.34k
> Recycle Misses: 376.12k
> Mutex Misses: 4.74k
> Evict Skips: 4.74k
>
> ARC Size: 68.53% 2.34 GiB
> Target Size: (Adaptive) 68.54% 2.34 GiB
> Min Size (Hard Limit): 12.50% 437.50 MiB
> Max Size (High Water): 8:1 3.42 GiB
>
> ARC Size Breakdown:
> Recently Used Cache Size: 92.95% 2.18 GiB
> Frequently Used Cache Size: 7.05% 169.01 MiB
>
> ARC Hash Breakdown:
> Elements Max: 229.96k
> Elements Current: 40.05% 92.10k
> Collisions: 705.52k
> Chain Max: 11
> Chains: 20.64k
>
> ------------------------------------------------------------------------
>
> ARC Efficiency: 7.96m
> Cache Hit Ratio: 84.92% 6.76m
> Cache Miss Ratio: 15.08% 1.20m
> Actual Hit Ratio: 76.29% 6.08m
>
> Data Demand Efficiency: 91.32% 4.99m
> Data Prefetch Efficiency: 19.57% 134.19k
>
> CACHE HITS BY CACHE LIST:
> Anonymously Used: 7.24% 489.41k
> Most Recently Used: 25.29% 1.71m
> Most Frequently Used: 64.54% 4.37m
> Most Recently Used Ghost: 1.42% 95.77k
> Most Frequently Used Ghost: 1.51% 102.33k
>
> CACHE HITS BY DATA TYPE:
> Demand Data: 67.42% 4.56m
> Prefetch Data: 0.39% 26.26k
> Demand Metadata: 22.41% 1.52m
> Prefetch Metadata: 9.78% 661.25k
>
> CACHE MISSES BY DATA TYPE:
> Demand Data: 36.11% 433.60k
> Prefetch Data: 8.99% 107.94k
> Demand Metadata: 32.00% 384.29k
> Prefetch Metadata: 22.91% 275.09k
>
> ------------------------------------------------------------------------
>
> L2ARC is disabled
>
> ------------------------------------------------------------------------
>
> File-Level Prefetch: (HEALTHY)
>
> DMU Efficiency: 26.49m
> Hit Ratio: 71.64% 18.98m
> Miss Ratio: 28.36% 7.51m
>
> Colinear: 7.51m
> Hit Ratio: 0.02% 1.42k
> Miss Ratio: 99.98% 7.51m
>
> Stride: 18.85m
> Hit Ratio: 99.97% 18.85m
> Miss Ratio: 0.03% 5.73k
>
> DMU Misc:
> Reclaim: 7.51m
> Successes: 0.29% 21.58k
> Failures: 99.71% 7.49m
>
> Streams: 130.46k
> +Resets: 0.35% 461
> -Resets: 99.65% 130.00k
> Bogus: 0
>
> ------------------------------------------------------------------------
>
> VDEV cache is disabled
>
> ------------------------------------------------------------------------
>
> ZFS Tunables (sysctl):
> kern.maxusers 384
> vm.kmem_size 4718592000
> vm.kmem_size_scale 1
> vm.kmem_size_min 0
> vm.kmem_size_max 329853485875
> vfs.zfs.l2c_only_size 0
> vfs.zfs.mfu_ghost_data_lsize 2705408
> vfs.zfs.mfu_ghost_metadata_lsize 332861440
> vfs.zfs.mfu_ghost_size 335566848
> vfs.zfs.mfu_data_lsize 1641984
> vfs.zfs.mfu_metadata_lsize 3048448
> vfs.zfs.mfu_size 28561920
> vfs.zfs.mru_ghost_data_lsize 68477440
> vfs.zfs.mru_ghost_metadata_lsize 62875648
> vfs.zfs.mru_ghost_size 131353088
> vfs.zfs.mru_data_lsize 1651216384
> vfs.zfs.mru_metadata_lsize 278577152
> vfs.zfs.mru_size 2306510848
> vfs.zfs.anon_data_lsize 0
> vfs.zfs.anon_metadata_lsize 0
> vfs.zfs.anon_size 12968960
> vfs.zfs.l2arc_norw 1
> vfs.zfs.l2arc_feed_again 1
> vfs.zfs.l2arc_noprefetch 1
> vfs.zfs.l2arc_feed_min_ms 200
> vfs.zfs.l2arc_feed_secs 1
> vfs.zfs.l2arc_headroom 2
> vfs.zfs.l2arc_write_boost 8388608
> vfs.zfs.l2arc_write_max 8388608
> vfs.zfs.arc_meta_limit 917504000
> vfs.zfs.arc_meta_used 851157616
> vfs.zfs.arc_min 458752000
> vfs.zfs.arc_max 3670016000
> vfs.zfs.dedup.prefetch 1
> vfs.zfs.mdcomp_disable 0
> vfs.zfs.write_limit_override 1048576000
> vfs.zfs.write_limit_inflated 25728073728
> vfs.zfs.write_limit_max 1072003072
> vfs.zfs.write_limit_min 33554432
> vfs.zfs.write_limit_shift 3
> vfs.zfs.no_write_throttle 0
> vfs.zfs.zfetch.array_rd_sz 1048576
> vfs.zfs.zfetch.block_cap 256
> vfs.zfs.zfetch.min_sec_reap 2
> vfs.zfs.zfetch.max_streams 8
> vfs.zfs.prefetch_disable 0
> vfs.zfs.mg_alloc_failures 8
> vfs.zfs.check_hostid 1
> vfs.zfs.recover 0
> vfs.zfs.txg.synctime_ms 1000
> vfs.zfs.txg.timeout 10
> vfs.zfs.scrub_limit 10
> vfs.zfs.vdev.cache.bshift 16
> vfs.zfs.vdev.cache.size 0
> vfs.zfs.vdev.cache.max 16384
> vfs.zfs.vdev.write_gap_limit 4096
> vfs.zfs.vdev.read_gap_limit 32768
> vfs.zfs.vdev.aggregation_limit 131072
> vfs.zfs.vdev.ramp_rate 2
> vfs.zfs.vdev.time_shift 6
> vfs.zfs.vdev.min_pending 4
> vfs.zfs.vdev.max_pending 10
> vfs.zfs.vdev.bio_flush_disable 0
> vfs.zfs.cache_flush_disable 0
> vfs.zfs.zil_replay_disable 0
> vfs.zfs.zio.use_uma 0
> vfs.zfs.version.zpl 5
> vfs.zfs.version.spa 28
> vfs.zfs.version.acl 1
> vfs.zfs.debug 0
> vfs.zfs.super_owner 0
>
> ------------------------------------------------------------------------
>
>
>
>
>
> 2012/2/15 Pavlo <devgs at ukr.net>:
>>
>>
>> Hello.
>>
>> We have an issue with memory management on FreeBSD and I suspect it is
>> related to the filesystem.
>> We are using ZFS; here are some quick stats:
>>
>>
>> zpool status
>> pool: disk1
>> state: ONLINE
>> scan: resilvered 657G in 8h30m with 0 errors on Tue Feb 14 21:17:37 2012
>> config:
>>
>> NAME STATE READ WRITE CKSUM
>> disk1 ONLINE 0 0 0
>> mirror-0 ONLINE 0 0 0
>> gpt/disk0 ONLINE 0 0 0
>> gpt/disk1 ONLINE 0 0 0
>> gpt/disk2 ONLINE 0 0 0
>> gpt/disk4 ONLINE 0 0 0
>> gpt/disk6 ONLINE 0 0 0
>> gpt/disk8 ONLINE 0 0 0
>> gpt/disk10 ONLINE 0 0 0
>> gpt/disk12 ONLINE 0 0 0
>> mirror-7 ONLINE 0 0 0
>> gpt/disk14 ONLINE 0 0 0
>> gpt/disk15 ONLINE 0 0 0
>>
>> errors: No known data errors
>>
>> pool: zroot
>> state: ONLINE
>> scan: resilvered 34.9G in 0h11m with 0 errors on Tue Feb 14 12:57:52 2012
>> config:
>>
>> NAME STATE READ WRITE CKSUM
>> zroot ONLINE 0 0 0
>> mirror-0 ONLINE 0 0 0
>> gpt/sys0 ONLINE 0 0 0
>> gpt/sys1 ONLINE 0 0 0
>>
>> errors: No known data errors
>>
>> ------------------------------------------------------------------------
>>
>> System Memory:
>>
>> 0.95% 75.61 MiB Active, 0.24% 19.02 MiB Inact
>> 18.25% 1.41 GiB Wired, 0.01% 480.00 KiB Cache
>> 80.54% 6.24 GiB Free, 0.01% 604.00 KiB Gap
>>
>> Real Installed: 8.00 GiB
>> Real Available: 99.84% 7.99 GiB
>> Real Managed: 96.96% 7.74 GiB
>>
>> Logical Total: 8.00 GiB
>> Logical Used: 21.79% 1.74 GiB
>> Logical Free: 78.21% 6.26 GiB
>>
>> Kernel Memory: 1.18 GiB
>> Data: 99.05% 1.17 GiB
>> Text: 0.95% 11.50 MiB
>>
>> Kernel Memory Map: 4.39 GiB
>> Size: 23.32% 1.02 GiB
>> Free: 76.68% 3.37 GiB
>>
>> ------------------------------------------------------------------------
>>
>> ------------------------------------------------------------------------
>> ZFS Subsystem Report Wed Feb 15 10:53:03 2012
>> ------------------------------------------------------------------------
>>
>> System Information:
>>
>> Kernel Version: 802516 (osreldate)
>> Hardware Platform: amd64
>> Processor Architecture: amd64
>>
>> ZFS Storage pool Version: 28
>> ZFS Filesystem Version: 5
>>
>> FreeBSD 8.2-STABLE #12: Thu Feb 9 11:35:23 EET 2012 root
>> 10:53AM up 56 mins, 6 users, load averages: 0.00, 0.00, 0.00
>>
>> ------------------------------------------------------------------------
>>
>>
>>
>>
>> Background:
>> We are using a tool that indexes some data and then pushes it into a
>> database (currently bdb-5.2). Instances of the indexer run
>> continuously, one after another. The indexing time for one instance
>> varies between 2 seconds and 30 minutes, but is mostly below one
>> minute. There is nothing else running on the machine except system
>> stuff and daemons. After several hours of indexing I can see a lot of
>> active memory, which is fine. Then I check the number of vnodes, and
>> it is really huge: 300k+, even though nobody has that many files open.
>> Reading the docs and googling, I figured that is because of cached
>> pages residing in memory (unmounting the disk causes all of that
>> memory to be freed). I also figured this happens only when I access
>> the files via mmap().
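>>
>> (For reference, the vnode figures come from the vfs.numvnodes and
>> kern.maxvnodes sysctls; the snippet below is just a rough sketch of
>> reading them from C, equivalent to running sysctl(8) on the two
>> names.)
>>
>> /* vncount.c: rough equivalent of
>>  * `sysctl vfs.numvnodes kern.maxvnodes` (FreeBSD-specific) */
>> #include <sys/types.h>
>> #include <sys/sysctl.h>
>> #include <stdio.h>
>> #include <string.h>
>>
>> /* the two OIDs may be exported with different integer widths, so
>>  * read into a union and let the returned length say which we got */
>> static long long read_oid(const char *name)
>> {
>>     union { int i; long l; } u;
>>     size_t len = sizeof(u);
>>     memset(&u, 0, sizeof(u));
>>     if (sysctlbyname(name, &u, &len, NULL, 0) == -1) {
>>         perror(name);
>>         return -1;
>>     }
>>     return len == sizeof(int) ? u.i : u.l;
>> }
>>
>> int main(void)
>> {
>>     printf("vfs.numvnodes:  %lld\n", read_oid("vfs.numvnodes"));
>>     printf("kern.maxvnodes: %lld\n", read_oid("kern.maxvnodes"));
>>     return 0;
>> }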
>>
>> This looks like fairly legitimate behaviour, but the issue is:
>> this goes on (for approximately 12 hours) until the indexers start
>> getting killed out of swap. As I wrote above, I observe a lot of used
>> vnodes and about 7 GB of active memory. I made a tool that allocates
>> memory using malloc() to check what the limit of allocatable memory
>> is. It is several megabytes, sometimes more, unless that tool gets
>> killed out of swap as well. So this is how I see the issue: for some
>> reason, after a process has exited normally, its mapped pages do not
>> get freed. I have read about this, and I agree that it is reasonable
>> behaviour while there is spare memory. But following this logic,
>> those pages can be flushed back to the file at any time the system
>> comes under memory pressure, so when I ask for a piece of RAM the OS
>> should do exactly that and give me what I ask for. But that never
>> happens. Those pages are as if frozen until I unmount the disk, even
>> when there is not a single instance of the indexer running.
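>>
>> The probe itself is trivial; simplified, it looks roughly like this
>> (the real tool just reports how far it gets before malloc() returns
>> NULL or the process is killed):
>>
>> /* memprobe.c: keep allocating and touching 1 MiB chunks until
>>  * malloc() fails, then report how much was obtained */
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>>
>> int main(void)
>> {
>>     const size_t chunk = 1024 * 1024;      /* 1 MiB per step */
>>     size_t got = 0;
>>     for (;;) {
>>         void *p = malloc(chunk);
>>         if (p == NULL)
>>             break;
>>         memset(p, 0xa5, chunk);            /* actually touch the pages */
>>         got += chunk;
>>     }
>>     printf("allocated and touched %zu MiB before malloc() failed\n",
>>            got / (1024 * 1024));
>>     return 0;
>> }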
>>
>> I am quite sure all of this is caused by mmap(): BDB uses mmap() for
>> accessing its databases, and we tested indexing without pushing the
>> data to the DB, which worked fine. You may suggest that something is
>> wrong with BDB, but we have more tools of our own that use mmap() as
>> well, and the behaviour is exactly the same.
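>>
>> For reference, the access pattern in those tools boils down to
>> something like this (a simplified sketch; read-only mapping, and the
>> file name is only a placeholder):
>>
>> /* maptouch.c: map a file read-only, fault in every page, unmap, exit */
>> #include <sys/mman.h>
>> #include <sys/stat.h>
>> #include <fcntl.h>
>> #include <stdio.h>
>> #include <unistd.h>
>>
>> int main(int argc, char **argv)
>> {
>>     if (argc != 2) {
>>         fprintf(stderr, "usage: %s file\n", argv[0]);
>>         return 1;
>>     }
>>
>>     int fd = open(argv[1], O_RDONLY);
>>     if (fd == -1) { perror("open"); return 1; }
>>
>>     struct stat st;
>>     if (fstat(fd, &st) == -1) { perror("fstat"); return 1; }
>>
>>     char *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
>>     if (p == MAP_FAILED) { perror("mmap"); return 1; }
>>
>>     long pagesz = sysconf(_SC_PAGESIZE);
>>     volatile char sink = 0;
>>     for (off_t off = 0; off < st.st_size; off += pagesz)
>>         sink += p[off];                    /* fault each page in */
>>
>>     munmap(p, st.st_size);
>>     close(fd);
>>     return 0;
>> }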
>>
>> Thank you. Paul, Ukraine.
> Hi Paul,
>
> Are you using dedup anywhere on that pool?
>
> Also, could you please post the full zfs-stats -a
>
>
--
--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney at brockmann-consult.de
Internet: http://www.brockmann-consult.de
--------------------------------------------