[rfc] enabling MALLOC_PRODUCTION on -HEAD for now, until jemalloc has been taught to have some run time selectable debug options

Ian Lepore ian at FreeBSD.org
Mon Jan 21 03:26:43 UTC 2013


On Sat, 2013-01-19 at 22:26 -0800, Adrian Chadd wrote:
> Hi,
> 
> I'd like to enable MALLOC_PRODUCTION on -HEAD.
> 
> I'm currently recompiling my libc on this g4 powerbook because the
> -HEAD snapshots don't have it enabled by default; just to get some
> damned decent performance out of this thing.
> 
> I'll work with Jason and others (eg Ian) who have a vested interest in
> trying to get it to run better out of the box, but still have the
> debug options available for people who wish to debug things.
> 

I've been investigating this today and have some information.

With MALLOC_PRODUCTION defined there is no problem, even on small
embedded systems.  Without MALLOC_PRODUCTION we've basically got two
problems:

      * Every program has a minimum resident size of about 8MiB, and
        that's fatal on a small-memory embedded system.
      * Performance is bad.  This is at least in part due to the expense
        of faulting in 8MiB of zeroed pages, and that's especially
        noticeable in utilities that should be small and fast.  There
        could be other causes as well.
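
For anyone who wants to check the resident-size claim on their own
system, here is a minimal sketch that does one small allocation and
reports the peak resident set size through getrusage().  The function
names are hypothetical and this is not part of the attached patch.  It
assumes ru_maxrss is reported in kilobytes, as it is documented on
FreeBSD and Linux.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Peak resident set size of the calling process, or -1 on error.
 * Assumption: ru_maxrss is in kilobytes (FreeBSD and Linux). */
long peak_rss_kb(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_maxrss;
}

/* Do one small allocation, as a trivial utility would, then print
 * and return the peak RSS observed so far. */
long report_rss_after_small_malloc(void)
{
    volatile char *p = malloc(64);
    long kb;

    if (p != NULL)
        p[0] = 1;               /* touch it so the allocation is really used */
    kb = peak_rss_kb();
    printf("peak RSS after malloc(64): %ld KB\n", kb);
    free((void *)p);
    return kb;
}
```

Calling report_rss_after_small_malloc() from main() in a libc built
with and without MALLOC_PRODUCTION should make any difference in
baseline resident size visible.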

I think I've tracked the cause of the 8MiB resident size to a particular
sanity check, which validates whether memory that was supposed to have
been zeroed actually was.  I think this check makes sense in some cases,
and not in others.  It almost certainly doesn't make sense if the memory
was freshly obtained from mmap().

I want to talk to Jason about a proper robust fix, but to help learn
more about the performance problem, I'm attaching a little test patch
that disables the suspect validity check.  It would be good if a few
folks running -current could apply this and build without
MALLOC_PRODUCTION defined, and see if the system feels more usable than
it does without the patch.  It's likely to make the most difference on a
slower or older system.

It's possible that this patch helps with the memory usage, but doesn't
help enough with performance.  I'm not in a good position to do
real-world performance testing myself right now.

In terms of non-real-world testing, I was using a trivial little app
that was basically:  int main(void) {malloc(64); return 0;} and a little
shell script to time running 100 iterations of that in a loop.  Without
the patch it took 24 seconds, with the patch 2 seconds, on a
medium-wimpy embedded ARM system.  It's probably too much to hope that a
12:1 improvement will scale up to non-trivial apps.

-- Ian

-------------- next part --------------
A non-text attachment was scrubbed...
Name: jemalloc_test.diff
Type: text/x-patch
Size: 518 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-arch/attachments/20130120/6e70fbfd/attachment.bin>

