ZFS and mem management
Pavlo
devgs at ukr.net
Wed Feb 15 10:11:24 UTC 2012
Hello.
We have an issue with memory management on FreeBSD, and I suspect it is
related to the filesystem. We are using ZFS; here are some quick stats:
zpool status

  pool: disk1
 state: ONLINE
  scan: resilvered 657G in 8h30m with 0 errors on Tue Feb 14 21:17:37 2012
config:

        NAME            STATE     READ WRITE CKSUM
        disk1           ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            gpt/disk0   ONLINE       0     0     0
            gpt/disk1   ONLINE       0     0     0
            gpt/disk2   ONLINE       0     0     0
            gpt/disk4   ONLINE       0     0     0
            gpt/disk6   ONLINE       0     0     0
            gpt/disk8   ONLINE       0     0     0
            gpt/disk10  ONLINE       0     0     0
            gpt/disk12  ONLINE       0     0     0
          mirror-7      ONLINE       0     0     0
            gpt/disk14  ONLINE       0     0     0
            gpt/disk15  ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
  scan: resilvered 34.9G in 0h11m with 0 errors on Tue Feb 14 12:57:52 2012
config:

        NAME          STATE     READ WRITE CKSUM
        zroot         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            gpt/sys0  ONLINE       0     0     0
            gpt/sys1  ONLINE       0     0     0

errors: No known data errors
------------------------------------------------------------------------
System Memory:
        0.95%   75.61 MiB Active,    0.24%  19.02 MiB Inact
        18.25%   1.41 GiB Wired,     0.01% 480.00 KiB Cache
        80.54%   6.24 GiB Free,      0.01% 604.00 KiB Gap

        Real Installed:                      8.00 GiB
        Real Available:              99.84%  7.99 GiB
        Real Managed:                96.96%  7.74 GiB

        Logical Total:                       8.00 GiB
        Logical Used:                21.79%  1.74 GiB
        Logical Free:                78.21%  6.26 GiB

Kernel Memory:                               1.18 GiB
        Data:                        99.05%  1.17 GiB
        Text:                         0.95% 11.50 MiB

Kernel Memory Map:                           4.39 GiB
        Size:                        23.32%  1.02 GiB
        Free:                        76.68%  3.37 GiB
------------------------------------------------------------------------
------------------------------------------------------------------------
ZFS Subsystem Report Wed Feb 15 10:53:03 2012
------------------------------------------------------------------------
System Information:
Kernel Version: 802516 (osreldate)
Hardware Platform: amd64
Processor Architecture: amd64
ZFS Storage pool Version: 28
ZFS Filesystem Version: 5
FreeBSD 8.2-STABLE #12: Thu Feb 9 11:35:23 EET 2012 root
10:53AM up 56 mins, 6 users, load averages: 0.00, 0.00, 0.00
------------------------------------------------------------------------
Background:
We are using a tool that indexes some data and then pushes it into a
database (currently bdb-5.2). Instances of the indexer run continuously,
one after another. The indexing time of a single instance varies between
2 seconds and 30 minutes, but is mostly below one minute. Nothing else is
running on the machine except the usual system services and daemons.
After several hours of indexing I can see a lot of active memory, which
is fine. Then I check the number of vnodes, and it is really huge: 300k+,
even though nobody has that many files open. After reading the docs and
googling, I figured out that this is because of cached pages that reside
in memory (unmounting the disk causes all of that memory to be freed).
I also figured out that this happens only when I access files via mmap().
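(For reference, the vnode count I am looking at is the vfs.numvnodes
sysctl. A minimal way to read it from C, assuming FreeBSD's
sysctlbyname(3), would be something like the sketch below; this is just
an illustration, not one of our tools.)

/* vnodes.c - print the current vnode count (illustrative sketch only).
 * vfs.numvnodes is the stock FreeBSD sysctl; it is a long here, adjust
 * the type if sysctl reports a different size on your system. */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

int
main(void)
{
        long numvnodes = 0;
        size_t len = sizeof(numvnodes);

        if (sysctlbyname("vfs.numvnodes", &numvnodes, &len, NULL, 0) == -1) {
                perror("sysctlbyname");
                return (1);
        }
        printf("vfs.numvnodes: %ld\n", numvnodes);
        return (0);
}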
This looks like fairly legitimate behaviour, but the issue is this: the
pattern goes on for roughly 12 hours, until the indexers start getting
killed out of swap. As I wrote above, at that point I see a lot of used
vnodes and around 7 GB of active memory. I made a tool that allocates
memory with malloc() to check how much memory can still be allocated
(sketched below). It manages only several megabytes, sometimes a bit
more, before it gets killed out of swap as well. So this is how I see
the issue: for some reason, after a process has exited normally, its
mapped pages are not freed. I have read about this and I agree that it
is reasonable behaviour while there is spare memory. But by that same
logic, those pages should be flushable back to the file whenever the
system comes under memory pressure, so when I ask for a piece of RAM the
OS should do exactly that and give me what I asked for. That never
happens. Those pages are effectively frozen until I unmount the disk,
even when not a single instance of the indexer is running.
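The probe is essentially just the following (an illustrative sketch, not
our exact tool; it assumes that touching the allocated pages is what
triggers the problem):

/* memprobe.c - keep allocating and touching memory until malloc() fails
 * or the process is killed, then report how much was obtained.
 * Illustrative sketch only. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK (1024 * 1024)     /* allocate in 1 MiB steps */

int
main(void)
{
        size_t total = 0;
        char *p;

        for (;;) {
                p = malloc(CHUNK);
                if (p == NULL)
                        break;
                memset(p, 0xa5, CHUNK); /* touch pages so they are really backed */
                total += CHUNK;
        }
        printf("allocated %zu MiB before malloc() failed\n",
            total / (1024 * 1024));
        return (0);
}

In practice the process is usually killed out of swap before malloc()
ever returns NULL, which is exactly what I observe with the real tool.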
I am quite sure all of this is caused by mmap(): BDB uses mmap() to
access its databases, and when we tested indexing without pushing the
data to the DB, everything worked fine. You may suggest that something
is wrong with BDB, but we have other tools of our own that use mmap() as
well, and the behaviour is exactly the same.
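What I mean by "accessing files via mmap()" is essentially the pattern
below (only a sketch: the path and the byte-summing loop are made up for
illustration, but the map/touch sequence is the generic pattern our
tools rely on):

/* mmap_reader.c - map a file and read through it. */
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
        const char *path = "/disk1/index/data.db";   /* hypothetical path */
        struct stat st;
        unsigned char *base;
        unsigned long sum = 0;
        off_t i;
        int fd;

        fd = open(path, O_RDONLY);
        if (fd == -1 || fstat(fd, &st) == -1) {
                perror(path);
                return (1);
        }
        base = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (base == MAP_FAILED) {
                perror("mmap");
                return (1);
        }
        for (i = 0; i < st.st_size; i++)  /* touching every page pulls it into memory */
                sum += base[i];
        printf("sum = %lu\n", sum);

        munmap(base, st.st_size);         /* pages stay cached after unmap/exit */
        close(fd);
        return (0);
}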
Thank you. Paul, Ukraine.