vm.kmem_size_max and vm.kmem_size capped at 329853485875 (~307GB)

Alan Cox alc at rice.edu
Sat Aug 18 19:14:25 UTC 2012


On 08/17/2012 17:08, Gezeala M. Bacuño II wrote:
> On Fri, Aug 17, 2012 at 1:58 PM, Alan Cox <alc at rice.edu> wrote:
>> vm.kmem_size controls the maximum size of the kernel's heap, i.e., the
>> region where the kernel's slab and malloc()-like memory allocators obtain
>> their memory.  While this heap may occupy the largest portion of the
>> kernel's virtual address space, it cannot occupy the entirety of the address
>> space.  There are other things that must be given space within the kernel's
>> address space, for example, the file system buffer map.
>>
>> ZFS does not, however, use the regular file system buffer cache. The ARC
>> takes its place, and the ARC abuses the kernel's heap like nothing else.
>> So, if you are running a machine that only makes trivial use of a non-ZFS
>> file system, like you boot from UFS, but store all of your data in ZFS, then
>> you can dramatically reduce the size of the buffer map via boot loader
>> tuneables and proportionately increase vm.kmem_size.
>>
>> Any further increases in the kernel virtual address space size will,
>> however, require code changes.  Small changes, but changes nonetheless.
>>
>> Alan
>>
>>
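
A minimal /boot/loader.conf sketch of the arrangement described above,
using the kern.maxbcache tuneable named later in this thread (the sizes
are illustrative, not recommendations for this machine):

    # Shrink the buffer map: a UFS root that is only used for booting
    # does not need a large buffer cache.
    kern.maxbcache="64M"
    # Give the reclaimed kernel address space to the heap for the ARC.
    vm.kmem_size="400G"
    vm.kmem_size_max="400G"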
> <<snip>>
>
>>> Additional Info:
>>> 1] Installed using PCBSD-9 Release amd64.
>>>
>>> 2] uname -a
>>> FreeBSD fmt-iscsi-stg1.musicreports.com 9.0-RELEASE FreeBSD
>>> 9.0-RELEASE #3: Tue Dec 27 14:14:29 PST 2011
>>>
>>> root at build9x64.pcbsd.org:/usr/obj/builds/amd64/pcbsd-build90/fbsd-source/9.0/sys/GENERIC
>>>    amd64
>>>
>>> 3] first few lines from /var/run/dmesg.boot:
>>> FreeBSD 9.0-RELEASE #3: Tue Dec 27 14:14:29 PST 2011
>>>
>>> root at build9x64.pcbsd.org:/usr/obj/builds/amd64/pcbsd-build90/fbsd-source/9.0/sys/GENERIC
>>> amd64
>>> CPU: Intel(R) Xeon(R) CPU E7- 8837  @ 2.67GHz (2666.82-MHz K8-class CPU)
>>>     Origin = "GenuineIntel"  Id = 0x206f2  Family = 6  Model = 2f  Stepping = 2
>>>
>>> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>>>
>>> Features2=0x29ee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,POPCNT,AESNI>
>>>     AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
>>>     AMD Features2=0x1<LAHF>
>>>     TSC: P-state invariant, performance statistics
>>> real memory  = 549755813888 (524288 MB)
>>> avail memory = 530339893248 (505771 MB)
>>> Event timer "LAPIC" quality 600
>>> ACPI APIC Table: <ALASKA A M I>
>>> FreeBSD/SMP: Multiprocessor System Detected: 64 CPUs
>>> FreeBSD/SMP: 8 package(s) x 8 core(s)
>>>
>>> 4] relevant sysctl's with manual tuning:
>>> kern.maxusers: 384
>>> kern.maxvnodes: 8222162
>>> vfs.numvnodes: 675740
>>> vfs.freevnodes: 417524
>>> kern.ipc.somaxconn: 128
>>> kern.openfiles: 5238
>>> vfs.zfs.arc_max: 428422987776
>>> vfs.zfs.arc_min: 53552873472
>>> vfs.zfs.arc_meta_used: 3167391088
>>> vfs.zfs.arc_meta_limit: 107105746944
>>> vm.kmem_size_max: 429496729600    ==>>  manually tuned
>>> vm.kmem_size: 429496729600    ==>>  manually tuned
>>> vm.kmem_map_free: 107374727168
>>> vm.kmem_map_size: 144625156096
>>> vfs.wantfreevnodes: 2055540
>>> kern.minvnodes: 2055540
>>> kern.maxfiles: 197248    ==>>  manually tuned
>>> vm.vmtotal:
>>> System wide totals computed every five seconds: (values in kilobytes)
>>> ===============================================
>>> Processes:              (RUNQ: 1 Disk Wait: 1 Page Wait: 0 Sleep: 150)
>>> Virtual Memory:         (Total: 1086325716K Active: 12377876K)
>>> Real Memory:            (Total: 144143408K Active: 803432K)
>>> Shared Virtual Memory:  (Total: 81384K Active: 37560K)
>>> Shared Real Memory:     (Total: 32224K Active: 27548K)
>>> Free Memory Pages:      365565564K
>>>
>>> hw.availpages: 134170294
>>> hw.physmem: 549561524224
>>> hw.usermem: 391395241984
>>> hw.realmem: 551836188672
>>> vm.kmem_size_scale: 1
>>> kern.ipc.nmbclusters: 2560000    ==>>  manually tuned
>>> kern.ipc.maxsockbuf: 2097152
>>> net.inet.tcp.sendbuf_max: 2097152
>>> net.inet.tcp.recvbuf_max: 2097152
>>> kern.maxfilesperproc: 18000
>>> net.inet.ip.intr_queue_maxlen: 256
>>> kern.maxswzone: 33554432
>>> kern.ipc.shmmax: 10737418240    ==>>  manually tuned
>>> kern.ipc.shmall: 2621440    ==>>  manually tuned
>>> vfs.zfs.write_limit_override: 0
>>> vfs.zfs.prefetch_disable: 0
>>> hw.pagesize: 4096
>>> hw.availpages: 134170294
>>> kern.ipc.maxpipekva: 8586895360
>>> kern.ipc.shm_use_phys: 1    ==>>  manually tuned
>>> vfs.vmiodirenable: 1
>>> debug.numcache: 632148
>>> vfs.ncsizefactor: 2
>>> vm.kvm_size: 549755809792
>>> vm.kvm_free: 54456741888
>>> kern.ipc.semmni: 256
>>> kern.ipc.semmns: 512
>>> kern.ipc.semmnu: 256
>>>
> Thanks. It will mainly be used for PostgreSQL and Java. We have a huge
> db (3TB and growing) and we need to keep as much of it as we can in
> ZFS's ARC. All data resides on zpools while root is on UFS. On our 8.2
> and 9 machines vm.kmem_size is always auto-tuned to almost the same
> size as the installed RAM. What I've tuned on those machines is to
> lower vfs.zfs.arc_max to 50% or 75% of vm.kmem_size; that has worked
> well for us, and the machines do not swap out. Now on this machine, I
> do think I need to adjust my formula for tuning vfs.zfs.arc_max, since
> reserving 25% for other stuff is probably overkill.
>
> We were able to successfully bump vm.kmem_size_max and vm.kmem_size to 400GB:
> vm.kmem_size_max: 429496729600    ==>>  manually tuned
> vm.kmem_size: 429496729600    ==>>  manually tuned
> vfs.zfs.arc_max: 428422987776  ==>>  auto-tuned (vm.kmem_size - 1G)
> vfs.zfs.arc_min: 53552873472  ==>>  auto-tuned
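
As a concrete sketch of that 50%/75% formula applied to this machine's
400G vm.kmem_size (the figure below is computed here, not taken from a
running system):

    # /boot/loader.conf -- hypothetical arc_max at 75% of vm.kmem_size
    # 429496729600 * 3 / 4 = 322122547200 bytes (300G)
    vfs.zfs.arc_max="322122547200"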
>
> Which other tuneables do I need to set in /boot/loader.conf so we can
> boot the machine with vm.kmem_size > 400G?  As I don't know which part
> of the boot-up process is failing with vm.kmem_size/_max set to 450G
> or 500G, I have no idea which tuneable to adjust next.


Your objective should be to reduce the value reported by "sysctl
vfs.maxbufspace".  You can do this by setting the loader.conf tuneable
"kern.maxbcache" to the desired value.

What does your machine currently report for "sysctl vfs.maxbufspace"?
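
For example (128M is a placeholder; a machine that does significant
I/O through a non-ZFS file system will want more):

    # /boot/loader.conf
    # Cap the buffer map; vfs.maxbufspace is derived from this at boot.
    kern.maxbcache="128M"

After a reboot, "sysctl vfs.maxbufspace" should report a value near the
new cap, and the kernel address space the buffer map no longer occupies
becomes available for a larger vm.kmem_size.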


