Memory allocation performance

Robert Watson rwatson at
Fri Feb 1 11:07:22 PST 2008

On Fri, 1 Feb 2008, Alexander Motin wrote:

> That was actually my second question. As there are only 512 items by default 
> and they are small in size, I can easily preallocate them all at boot. But is 
> that a good approach? Why can't UMA do just the same when I have created a 
> zone with a specified element size and maximum number of objects? What is the 
> principal difference?


I think we should drill down in the analysis a bit and see if we can figure 
out what's going on with UMA.  What UMA essentially does is ask the VM for 
pages, and then pack objects into pages.  It maintains some meta-data and, 
depending on the relative sizes of objects and pages, may store that meta-data 
in the page itself or elsewhere.  Either way, it looks very much like an array of 
struct object.  It has a few extra layers of wrapping in order to maintain 
stats, per-CPU caches, object life cycle, etc.  When INVARIANTS is turned off, 
allocation from the per-CPU cache consists of pulling objects in and out of 
one of two per-CPU queues.  So I guess the question is: where are the cycles 
going?  Are we suffering excessive cache misses in managing the slabs?  Are 
you effectively "cycling through" objects rather than using a smaller set that 
fits better in the cache?  Is some bit of debugging enabled that shouldn't be, 
perhaps due to a failure of ifdefs?
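
To make the "two per-CPU queues" fast path concrete, here is a minimal 
user-space sketch.  It is not the UMA source: the names (cpu_cache, 
alloc_bucket, free_bucket, BUCKET_SIZE) and the swap policy are simplified 
assumptions for illustration, and locking, statistics, and the slab layer are 
left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define BUCKET_SIZE 4	/* hypothetical; real bucket sizes are chosen by UMA */

/* A bucket is a small fixed-size stack of pointers to free objects. */
struct bucket {
	int	 count;
	void	*items[BUCKET_SIZE];
};

/* One cache per CPU: allocations pop from one bucket, frees push to the other. */
struct cpu_cache {
	struct bucket	alloc_bucket;
	struct bucket	free_bucket;
	unsigned long	slow_path;	/* times the cache could not satisfy a request */
};

static void
swap_buckets(struct cpu_cache *c)
{
	struct bucket tmp = c->alloc_bucket;

	c->alloc_bucket = c->free_bucket;
	c->free_bucket = tmp;
}

/* Return a cached object, or NULL when the caller must fall back to the zone. */
static void *
cache_alloc(struct cpu_cache *c)
{
	/* Alloc bucket drained: reuse recently freed objects if there are any. */
	if (c->alloc_bucket.count == 0 && c->free_bucket.count > 0)
		swap_buckets(c);
	if (c->alloc_bucket.count > 0)
		return (c->alloc_bucket.items[--c->alloc_bucket.count]);
	c->slow_path++;
	return (NULL);
}

/* Return 1 if the object was cached, 0 when both buckets are full. */
static int
cache_free(struct cpu_cache *c, void *obj)
{
	/* Free bucket full: swap if the alloc bucket still has room. */
	if (c->free_bucket.count == BUCKET_SIZE &&
	    c->alloc_bucket.count < BUCKET_SIZE)
		swap_buckets(c);
	if (c->free_bucket.count < BUCKET_SIZE) {
		c->free_bucket.items[c->free_bucket.count++] = obj;
		return (1);
	}
	c->slow_path++;
	return (0);
}
```

In this sketch the most recently freed object is the first one handed back, 
so a small working set keeps getting reused.  If the slow_path counter climbs 
quickly under your workload, the buckets are too small for the alloc/free 
pattern and the work is moving out of the per-CPU cache.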

BTW, UMA does let you set the size of buckets, so you can try tuning the 
bucket size.  As a start, try setting the zone flag UMA_ZONE_MAXBUCKET.
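
As a rough sketch of what that might look like, the kernel fragment below 
creates a zone with UMA_ZONE_MAXBUCKET, caps it, and preallocates the items.  
The zone name, struct my_item, my_zone_setup(), and the count of 512 are 
placeholders taken from the discussion, not from any real driver, and the 
exact prototypes differ between FreeBSD versions, so check uma(9) and 
vm/uma.h on your tree before relying on it.

```c
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <vm/uma.h>

/* Hypothetical small fixed-size item; stands in for the real structure. */
struct my_item {
	void	*mi_data;
	int	 mi_flags;
};

static uma_zone_t my_zone;

static void
my_zone_setup(void)
{
	/* No ctor/dtor/init/fini; ask UMA for its largest per-CPU buckets. */
	my_zone = uma_zcreate("my_item", sizeof(struct my_item),
	    NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_MAXBUCKET);

	/* Cap the zone at the default item count mentioned above (assumed 512). */
	uma_zone_set_max(my_zone, 512);

	/* Ask UMA to back that many items up front rather than on demand. */
	uma_prealloc(my_zone, 512);
}
```

Whether this actually helps is exactly what the profiling should tell us; it 
is an experiment, not a recommendation.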

It would be very helpful if you could try doing some analysis with hwpmc -- 
"high resolution profiling" is of increasingly limited utility with modern 
CPUs, where even a high frequency timer fires only rarely relative to the 
instruction rate.  It's also quite susceptible to cyclic events that align 
with other timers in the system.
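
The alignment problem can be shown with a toy model.  The function below is 
purely illustrative and models no real profiler: it assumes a workload that 
repeats every "period" ticks, spends the first "a_ticks" of each period in 
some function A, and is sampled every "interval" ticks.

```c
#include <assert.h>

/*
 * Count how many of nsamples timer samples land in function A, given a
 * workload that repeats every `period` ticks and runs A for the first
 * `a_ticks` of each period.  All parameters are hypothetical model inputs.
 */
static int
samples_in_a(int period, int a_ticks, int interval, int offset, int nsamples)
{
	int hits = 0;

	for (int i = 0; i < nsamples; i++) {
		long t = (long)offset + (long)i * interval;

		/* The phase within the current period decides where we are. */
		if (t % period < a_ticks)
			hits++;
	}
	return (hits);
}
```

With period 10 and a_ticks 2, A really takes 20% of the time.  A sample 
interval of 7 reports that share correctly, but an interval of 100 always 
lands on the same phase and reports either 100% or 0% depending on the 
starting offset.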

Robert N M Watson
Computer Laboratory
University of Cambridge
