Jason Evans jasone at
Thu Feb 7 21:39:16 UTC 2008

I've been working on jemalloc a bunch lately, as a result of working 
with the Mozilla folks to integrate it with Firefox.  One of the 
problems we ran into is that on Windows, there is no good way to tell 
the resident set size of an individual application, unless we "decommit" 
unused pages.  I won't go into the details of Windows memory management, 
but suffice it to say that the solution to this problem is also a 
reasonable solution to the problem of expensive madvise(... MADV_FREE) 
calls on FreeBSD.

I recently committed code to FreeBSD that tracks whether each page 
within a mapped chunk is unused and dirty.  If the number of such dirty 
pages exceeds a threshold, jemalloc sweeps downward through memory and 
calls madvise() on enough dirty pages to drop the dirty page count to no 
more than half of the threshold value.  The default threshold is 
currently 512 pages per arena (2 MB with 4 kB pages), and it can be 
tuned via the F and f flags in MALLOC_OPTIONS (F doubles the threshold, 
f halves it).  See the malloc(3) man page for details.
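
To make the mechanism concrete, here is a minimal simulation of the 
threshold-and-sweep idea.  It is a sketch, not jemalloc's actual code: 
every name in it (toy_arena, toy_madvise, toy_purge, toy_page_freed, 
NPAGES) is hypothetical, madvise() is replaced by a counter so the 
example runs anywhere, and purging whole contiguous runs with one call 
each is an assumption about how the calls are batched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; a simulation, not jemalloc's code. */
#define NPAGES 1024               /* pages in the toy chunk (assumption) */

typedef struct {
    bool   dirty[NPAGES];         /* true if the page is unused and dirty */
    size_t ndirty;                /* current number of dirty unused pages */
    size_t threshold;             /* purge trigger, e.g. 512 pages */
    size_t nsweeps;               /* number of purge sweeps so far */
    size_t nmadvise;              /* simulated madvise() calls so far */
    size_t npurged;               /* total pages purged so far */
} toy_arena;

/* Stand-in for madvise(addr, len, MADV_FREE) on pages [first, first + n). */
static void toy_madvise(toy_arena *a, size_t first, size_t n)
{
    for (size_t i = first; i < first + n; i++)
        a->dirty[i] = false;
    a->ndirty -= n;
    a->npurged += n;
    a->nmadvise++;
}

/*
 * Sweep downward from the highest page.  Each contiguous run of dirty
 * pages costs one simulated madvise() call.  Stop as soon as the dirty
 * count is no more than half of the threshold.
 */
static void toy_purge(toy_arena *a)
{
    size_t target = a->threshold / 2;
    size_t i = NPAGES;

    a->nsweeps++;
    while (i > 0 && a->ndirty > target) {
        if (!a->dirty[i - 1]) {
            i--;                  /* clean or in-use page: keep sweeping */
            continue;
        }
        size_t end = i;           /* one past the top page of this run */
        while (i > 0 && a->dirty[i - 1])
            i--;                  /* extend the run downward */
        toy_madvise(a, i, end - i);
    }
}

/* Record that a page became unused and dirty; purge if over threshold. */
static void toy_page_freed(toy_arena *a, size_t page)
{
    assert(page < NPAGES && a->threshold > 0);
    if (!a->dirty[page]) {
        a->dirty[page] = true;
        a->ndirty++;
    }
    if (a->ndirty > a->threshold)
        toy_purge(a);
}
```

With a threshold of 8, freeing pages 0 through 8 in order triggers one 
sweep that purges all nine pages with a single simulated call, because 
they form one run.  Freeing every other page instead (0, 2, ..., 16) 
costs five calls for five pages, and the sweep stops with the four 
lowest dirty pages left alone.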

By sweeping downward through memory, jemalloc tends to call madvise() on 
pages that are less likely to be reused soon.  Also, by delaying the 
madvise() calls, unused pages tend to coalesce, thus reducing the total 
number of calls.  Following are some statistics from a contrived test 
(repeatedly opening and closing a 36 MB file within vim):

dirty: 119 pages dirty, 45 sweeps, 117 madvises, 20479 pages purged
             allocated      nmalloc      ndalloc
small:         428216        64195        53915
large:         188416        41419        41404
total:         616632       105614        95319

I've been seeing madvise():pages purged ratios of 1:100+ for the tests 
I've run (roughly 1:175 in the run above: 117 calls for 20479 pages), so 
this mechanism appears to be pretty cheap in practice.

Anyway, the reason I think this change to jemalloc matters is that it 
inexpensively puts pretty reasonable bounds on how much dirty unused 
memory the entire OS (all processes) has lying around, without requiring 
any complex interactions with the kernel.  I've been thinking a lot 
about this problem since the discussion here last month (see "sbrk(2) 
broken" thread), and the 100% solutions like receiving notifications 
from the kernel are in my opinion prohibitively complex in the context 
of multi-threaded applications.

