ten thousand small processes
D. J. Bernstein
djb at cr.yp.to
Wed Jun 25 19:04:26 PDT 2003
Chuck Swiger writes:
> Remember that VMM hardware requires page-alignment
When I ask why the stack and data aren't put on the same page, and you
say ``They aren't on the same page,'' you aren't answering the question.
(As for adding an x bit to data: This obviously won't break anything.)
Here's an alternative layout that doesn't move the text. Subtract the
data+bss (or at least data) amount from the stack starting position, and
put the data+bss (or data; but not the heap, obviously) into that space.
This saves 78 megabytes of memory in the situation I'm talking about.
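A rough sketch of that layout arithmetic, in C. Everything named here (round_up_16, struct layout, place_data_at_stack_top, bytes_saved) is hypothetical and only illustrates the idea; the 16-byte alignment, the 4096-byte page size and the count of two pages saved per process are assumptions, not figures stated above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round n up to a multiple of 16 so the relocated
   data and the stack below it stay 16-byte aligned (an assumption). */
size_t round_up_16(size_t n)
{
    return (n + 15) & ~(size_t)15;
}

struct layout {
    uintptr_t data_start;   /* data+bss occupies [data_start, old stack start) */
    uintptr_t stack_start;  /* the stack now grows down from here */
};

/* Carve the data+bss region out of the top of the stack area:
   subtract its size from the old stack starting position. */
struct layout place_data_at_stack_top(uintptr_t old_stack_start,
                                      size_t data_bss_size)
{
    struct layout l;
    l.data_start = old_stack_start - round_up_16(data_bss_size);
    l.stack_start = l.data_start;
    return l;
}

/* Total bytes saved when every process needs pages_per_process fewer
   pages of page_size bytes each. */
uint64_t bytes_saved(uint64_t processes, uint64_t pages_per_process,
                     uint64_t page_size)
{
    return processes * pages_per_process * page_size;
}
```

With 10,000 processes (the number in the subject line), 4096-byte pages and two pages saved per process, bytes_saved gives 81,920,000 bytes, about 78.1 megabytes counted in units of 2^20 bytes. That is one reading consistent with the 78-megabyte figure; the per-process breakdown is an inference.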
> Mach uses copy-on-write
I'm not talking about copy-on-write. I'm not talking about shared pages.
I'm talking about RAM being frittered away for memory-management tables
that, in this situation, could trivially be compressed by two orders of
magnitude. This is not rocket science.
Jon's ``dynamic page-table creation'' terminology is pretty good. Of
course, for processes with many pages of process-specific memory, the
page table should be cached rather than being shared among processes;
I'm not suggesting any change in how browser memory is handled.
> It's easy to write a memory allocator that performs a specific case well;
> writing a general purpose malloc is significantly more complicated,
I'm not talking about replacing malloc() with a special-purpose
allocator. I'm talking about adding a tiny bit of code to malloc() to
magically take advantage of space that is being ignored right now.
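A minimal sketch of what such a tiny addition could look like, assuming the ignored space is the unused tail of the last data+bss page (an assumption; the exact region is not spelled out here). The static array slack stands in for that region, and slack_malloc, slack_free, in_slack and SLACK_BYTES are hypothetical names, not part of any real libc. Requests that fit are carved out of the slack; everything else goes to the ordinary malloc().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical size of the otherwise-ignored region. */
#define SLACK_BYTES 2048

/* Stand-in for the unused tail of the data+bss page. */
_Alignas(max_align_t) unsigned char slack[SLACK_BYTES];
size_t slack_used = 0;

/* True if p points into the slack region. */
int in_slack(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)slack;
    return a >= lo && a < lo + SLACK_BYTES;
}

/* Serve small requests from the slack region first; fall back to the
   general-purpose allocator when the request does not fit. */
void *slack_malloc(size_t n)
{
    const size_t align = _Alignof(max_align_t);
    size_t need;

    if (n == 0)
        n = 1;
    if (n > SLACK_BYTES)            /* checked before rounding: no overflow */
        return malloc(n);
    need = (n + align - 1) & ~(align - 1);
    if (need <= SLACK_BYTES - slack_used) {
        void *p = slack + slack_used;
        slack_used += need;
        return p;
    }
    return malloc(n);
}

/* Slack memory is never reused in this sketch; only heap blocks are
   handed back to free(). */
void slack_free(void *p)
{
    if (p == NULL || in_slack(p))
        return;
    free(p);
}
```

A program whose allocations all fit in the slack region never reaches the malloc() fallback, so under these assumptions it never extends the heap. A real implementation would live inside malloc() itself and would presumably reuse freed slack; this sketch leaves both out.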
The savings in this situation go beyond those dozens of megabytes of
magically reacquired RAM. There's a nasty spike in memory usage as soon
as malloc() starts extending the heap; when a program's allocations fit
into the magically reacquired RAM, the program also avoids the spike.
---D. J. Bernstein, Associate Professor, Department of Mathematics,
Statistics, and Computer Science, University of Illinois at Chicago
More information about the freebsd-performance