kernel killing processes when out of swap

Nick Barnes Nick.Barnes at pobox.com
Tue Apr 12 08:09:25 PDT 2005


At 2005-04-12 14:26:40+0000, Marc Olzheim writes:
> On Tue, Apr 12, 2005 at 03:06:41PM +0100, Nick Barnes wrote:
> > The right choice is for mmap() to return ENOMEM, and then for malloc()
> > to return NULL, but almost no operating systems make this choice any
> > more.
> 
> No, the problem occurs only when previously allocated / mmap()d blocks
> are actually used (written) and when the total of virtual memory has
> been overcommitted: Physical pages are not allocated to processes at
> malloc() time, but at time of first usage (Copy On Write).

Yes, implicit in my statement is that the OS shouldn't overcommit.  I
remember when overcommit was new (maybe 1990), and some Unix (Irix,
perhaps, or AIX?) made it switchable.  There was a bit of a flurry in
the OS community, as some people (myself included) felt that the OS
shouldn't make promises it couldn't fulfill, and that this "kill a
random process" behaviour was more of a bug than a solution.  Consider
a parallel design which allows (say) file descriptors to be
overcommitted.  You can open a billion files, but if you touch one of
them, that consumes a finite kernel resource, and if the kernel has
run out then a randomly chosen process gets killed.  Great.
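To illustrate the point (a minimal C sketch, not from the original
mail; the 8 GiB figure is arbitrary): on an overcommitting kernel the
allocation below will typically succeed even when RAM plus swap cannot
back it, and the failure only surfaces when the pages are first
written, at which point the kernel may kill this process, or some
unrelated one, instead of returning an error.

  #include <stdio.h>
  #include <stdlib.h>

  int main(void)
  {
      /* Ask for far more memory than the machine is likely to back.
       * With overcommit, malloc()/mmap() will usually say yes anyway. */
      size_t size = (size_t)8 * 1024 * 1024 * 1024;  /* 8 GiB, arbitrary */
      char *p = malloc(size);

      if (p == NULL) {
          /* This is the answer a strictly-accounting kernel would give. */
          fprintf(stderr, "malloc failed: out of memory\n");
          return 1;
      }
      printf("malloc of %zu bytes succeeded; now touching the pages...\n",
             size);

      /* Physical pages are only allocated on first write.  If the kernel
       * has overcommitted, this loop is where the OOM killer strikes,
       * possibly taking down a different process rather than this one. */
      for (size_t i = 0; i < size; i += 4096)
          p[i] = 1;

      printf("touched every page without incident\n");
      free(p);
      return 0;
  }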

> many programs have been programmed in a way that assumes this
> behaviour, for instance by sparsely using large allocations instead
> of adding the possible extra bookkeeping to allow for smaller
> allocations.

This is the well-known problem with my fantasy world in which the OS
doesn't overcommit any resources.  All those programs are broken, but
it's too costly to fix them.  If overcommit had been resisted more
effectively in the first place, those programs would have been written
properly.
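
For instance (a hypothetical sketch, not code from either mail): a
program might reserve one huge anonymous mapping as a sparse table and
touch only a handful of slots.  Under overcommit only the touched
pages cost anything; a strictly-accounting kernel would have to refuse
the whole reservation up front unless RAM and swap could cover it.

  #include <stdio.h>
  #include <stdint.h>
  #include <sys/mman.h>

  #define SLOTS ((size_t)1 << 28)           /* 256M slots, arbitrary   */
  #define BYTES (SLOTS * sizeof(uint64_t))  /* 2 GiB of address space  */

  int main(void)
  {
      /* Reserve one big anonymous region and use it as a sparse table.
       * Under overcommit this "works" because untouched pages are free. */
      uint64_t *table = mmap(NULL, BYTES, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANON, -1, 0);
      if (table == MAP_FAILED) {
          /* A non-overcommitting kernel would fail here unless it could
           * genuinely back all 2 GiB. */
          perror("mmap");
          return 1;
      }

      /* Only a few slots are ever written, so only a few pages are
       * actually allocated; the extra bookkeeping of many smaller
       * allocations is avoided entirely. */
      table[12345]     = 1;
      table[9999999]   = 2;
      table[200000000] = 3;

      printf("sparse table at %p, three entries set\n", (void *)table);
      munmap(table, BYTES);
      return 0;
  }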

My recollection, quite possibly faulty, is that FreeBSD came quite
late to the overcommit party.

Nick B
