sbrk(2), OOM-killer and malloc() overcommit

Vadim Goncharov vadim_nuclight at mail.ru
Mon Jan 7 08:06:40 PST 2008


07.01.08 @ 20:58 Dag-Erling Smørgrav wrote:

> "Vadim Goncharov" <vadim_nuclight at mail.ru> writes:
>> There was a case in our town where, on a heavily loaded web server, Apache
>> processes were dying under memory pressure - the aforementioned man said
>> that was due to overcommit and the OOM killer at work.
>
> Well, technically, it was because the server didn't have enough RAM for
> the workload it was given.  Turning off memory overcommit wouldn't fix
> that, it would just change the symptoms.

Yes, but when they have a server with multiple gigabytes of RAM and are
told that Linux would do better, then, whether or not they are technically
competent, we can lose users...

> I don't know of a single server OS that doesn't overcommit memory.  The
> only difference between them is how they behave once the shit hits the
> fan.

I've heard about disabling it for selected processes, or about things like
memory reservation backed by temporary files managed by the OS (AFAIR, that
was HP-UX). Or there is the Linux overcommit switch, which makes ordinary
people happy enough not to blame the OS (here are the defaults):

master:~# cat /proc/sys/vm/overcommit_ratio
50
master:~# cat /proc/sys/vm/overcommit_memory
0
master:~# cat /usr/src/linux-2.6.16.21-0.8/Documentation/vm/overcommit-accounting
The Linux kernel supports the following overcommit handling modes

0       -       Heuristic overcommit handling. Obvious overcommits of
                 address space are refused. Used for a typical system. It
                 ensures a seriously wild allocation fails while allowing
                 overcommit to reduce swap usage.  root is allowed to
                 allocate slightly more memory in this mode. This is the
                 default.

1       -       Always overcommit. Appropriate for some scientific
                 applications.

2       -       Don't overcommit. The total address space commit
                 for the system is not permitted to exceed swap + a
                 configurable percentage (default is 50) of physical RAM.
                 Depending on the percentage you use, in most situations
                 this means a process will not be killed while accessing
                 pages but will receive errors on memory allocation as
                 appropriate.

The overcommit policy is set via the sysctl `vm.overcommit_memory'.

The overcommit percentage is set via `vm.overcommit_ratio'.

The current overcommit limit and amount committed are viewable in
/proc/meminfo as CommitLimit and Committed_AS respectively.

Gotchas
-------

The C language stack growth does an implicit mremap. If you want absolute
guarantees and run close to the edge you MUST mmap your stack for the
largest size you think you will need. For typical stack usage this does
not matter much but it's a corner case if you really really care

In mode 2 the MAP_NORESERVE flag is ignored.


How It Works
------------

The overcommit is based on the following rules

For a file backed map
         SHARED or READ-only     -       0 cost (the file is the map not swap)
         PRIVATE WRITABLE        -       size of mapping per instance

For an anonymous or /dev/zero map
         SHARED                  -       size of mapping
         PRIVATE READ-only       -       0 cost (but of little use)
         PRIVATE WRITABLE        -       size of mapping per instance

Additional accounting
         Pages made writable copies by mmap
         shmfs memory drawn from the same pool

Status
------

o       We account mmap memory mappings
o       We account mprotect changes in commit
o       We account mremap changes in size
o       We account brk
o       We account munmap
o       We report the commit status in /proc
o       Account and check on fork
o       Review stack handling/building on exec
o       SHMfs accounting
o       Implement actual limit enforcement

To Do
-----
o       Account ptrace pages (this is hard)
master:~#
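
To make the mode 2 arithmetic and the "How It Works" rules from the document
above more concrete, here is a minimal Python sketch. It is only an
illustration, not the kernel's code: the function names (parse_meminfo,
commit_limit_kb, commit_cost) are hypothetical, the SAMPLE figures are
invented rather than taken from a real machine, and the limit is computed
exactly as the text words it (swap plus a percentage of RAM), ignoring any
other adjustments a real kernel may apply.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Name:   123 kB' lines into {name: number}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_limit_kb(ram_kb, swap_kb, ratio_percent=50):
    """Mode 2 limit as the document states it: swap + ratio% of physical RAM."""
    return swap_kb + ram_kb * ratio_percent // 100


def commit_cost(size, file_backed, shared, writable):
    """Bytes charged against the commit for one mapping, per the quoted rules."""
    if file_backed:
        # SHARED or READ-only file maps cost 0; PRIVATE WRITABLE costs its size.
        return size if (writable and not shared) else 0
    if shared:
        # Anonymous or /dev/zero SHARED maps are charged their full size.
        return size
    # Anonymous PRIVATE: READ-only costs 0, WRITABLE costs its size.
    return size if writable else 0


# Invented sample values (assumption): 4 GB of RAM and 2 GB of swap.
SAMPLE = """\
MemTotal:      4194304 kB
SwapTotal:     2097152 kB
CommitLimit:   4194304 kB
Committed_AS:  1048576 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    limit = commit_limit_kb(info["MemTotal"], info["SwapTotal"])
    print("computed limit:", limit, "kB; reported:", info["CommitLimit"], "kB")
    print("headroom before allocations fail:", limit - info["Committed_AS"], "kB")
```

On a real Linux box one could feed the contents of /proc/meminfo to
parse_meminfo() instead of SAMPLE and compare the computed limit with the
reported CommitLimit; a difference would simply mean the kernel takes more
into account than this sketch does.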


> Anyway, as somebody else mentioned, the details are in the archives - if
> you don't know enough English to find them there, I don't see how having
> them summarized in English will help.  If the language barrier really is
> a problem, ask someone who speaks your language to help you.

1) It is simpler to understand a relatively short summary, or to ask
someone who knows the language for help with a fairly short text, than to
ask that person to go through many pages.

2) Such an article would be good for advocacy: it would explain why this
is not a bug, why we are no worse in this respect than their cool OS, and
what a programmer should do. That is, of course, if this is really so - or
maybe we want to commit a patch (Kostik Belousov's may be a good starting
point) with a tunable that allows disabling overcommit?

-- 
WBR, Vadim Goncharov


More information about the freebsd-current mailing list