Differences in malloc between 6 and 7?
Jason Evans
jasone at freebsd.org
Tue Mar 4 19:34:09 UTC 2008
gnn at freebsd.org wrote:
> One of the folks I'm working with found this. The following code,
> which yes, is just an example, is 1/2 as fast on 7.0-RELEASE as on
> 6.3. Where should I look to find out why?
There is a definite performance problem in arena_run_alloc(), but I'm
happy to report that it was fixed in -current a while back. I plan to
MFC malloc to RELENG_7 within the next few weeks.
In a nutshell, the arena_run_alloc() performance problem is due to using
a linear search to find sufficiently large runs of mapped (but
currently unused) pages. There are caching mechanisms that speed up the
searches to some degree, but there are still some linear aspects to the
algorithm, so as memory usage increases, the searches take progressively
longer. In -current, this problem is solved by maintaining red-black
trees, so that arena_run_alloc() does an O(lg n) tree search, rather than
an O(n) iterative search.
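To illustrate the difference in search cost, here is a minimal sketch. It is not the jemalloc code: the free_run record and both function names are hypothetical, and a sorted array with binary search stands in for the red-black tree, since both give an O(lg n) lookup over runs ordered by size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for a run of mapped (but currently unused) pages. */
typedef struct {
    size_t npages;  /* length of the free run, in pages */
    size_t first;   /* index of the first page in the run */
} free_run;

/* O(n): walk the free runs until one is large enough (first fit). */
const free_run *
search_linear(const free_run *runs, size_t n, size_t need)
{
    for (size_t i = 0; i < n; i++) {
        if (runs[i].npages >= need)
            return &runs[i];
    }
    return NULL;
}

/*
 * O(lg n): runs[] must be sorted by ascending npages.  Binary search for
 * the smallest run with npages >= need (best fit).  A balanced tree keyed
 * on run size supports the same lookup and also allows cheap insertion
 * and removal, which a sorted array does not.
 */
const free_run *
search_ordered(const free_run *runs, size_t n, size_t need)
{
    size_t lo = 0, hi = n;

    assert(runs != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (runs[mid].npages < need)
            lo = mid + 1;   /* too small, look at larger runs */
        else
            hi = mid;       /* large enough, try to find a tighter fit */
    }
    return (lo < n) ? &runs[lo] : NULL;
}
```

With n free runs, search_linear() may inspect all of them before finding a fit, while search_ordered() inspects about lg n, which is why the linear version slows down as memory usage grows.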
It's worth mentioning that the benchmark is of marginal use, due to a
simple (but common) flaw. At a minimum, a malloc benchmark should touch
all allocated memory at least once. Otherwise, the benchmark is IMO too
far removed from reality to measure anything of value, since memory
access patterns look nothing like those of an actual application that
dynamically allocates memory. Both phkmalloc and jemalloc use data
structures that are mostly disjoint from the allocations (no headers),
so the benchmark never even faults most pages in. This is especially
true for phkmalloc, so jemalloc is unjustly penalized. If we were to
include, say, dlmalloc in this comparison, it would be even more heavily
penalized due to touching the pages while modifying allocation headers.
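As a rough sketch of what touching all allocated memory could look like in a benchmark loop (this is not the benchmark from the original report; the function names are hypothetical, and the 4096-byte page size is an assumption that real code would query from the system):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed page size; real code could use sysconf(_SC_PAGESIZE) instead. */
#define PAGE_SIZE_GUESS 4096

/*
 * Write one byte per page-sized stride, plus the final byte, so that every
 * page backing the allocation is faulted in even if the allocation does
 * not start on a page boundary.
 */
void
touch_pages(void *p, size_t size)
{
    volatile unsigned char *bytes = p;

    for (size_t off = 0; off < size; off += PAGE_SIZE_GUESS)
        bytes[off] = 1;
    if (size > 0)
        bytes[size - 1] = 1;
}

/*
 * Allocate nallocs objects of the given size, touch each one, then free
 * them all.  Returns the number of allocations that succeeded.
 */
size_t
bench_alloc_touch(size_t nallocs, size_t size)
{
    void **ptrs = malloc(nallocs * sizeof(*ptrs));
    size_t ok = 0;

    if (ptrs == NULL)
        return 0;
    for (size_t i = 0; i < nallocs; i++) {
        ptrs[i] = malloc(size);
        if (ptrs[i] != NULL) {
            touch_pages(ptrs[i], size);
            ok++;
        }
    }
    for (size_t i = 0; i < nallocs; i++)
        free(ptrs[i]);  /* free(NULL) is a no-op */
    free(ptrs);
    return ok;
}
```

Timing bench_alloc_touch() rather than a bare malloc()/free() loop charges every allocator for the page faults an application would incur anyway, so allocators that keep metadata in headers are not singled out.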
Jason