hot path optimizations in uma_zalloc() & uma_zfree()
gurney_j at resnet.uoregon.edu
Thu Jun 30 17:42:12 GMT 2005
ant wrote this message on Thu, Jun 30, 2005 at 01:08 +0300:
> I just tried managing the buckets in the per-CPU cache the way
> Solaris does (see Jeff Bonwick's paper, "Magazines and Vmem")
> and got a performance gain of around 10% in my test program.
> Then I made another minor code optimization and got another 10%.
> The program just creates and destroys sockets in a loop.
> I suppose the first gain comes from an increase in CPU cache hits.
> In the current FreeBSD code, allocations and frees deal with
> separate buckets. The buckets are exchanged when one of them
> becomes full or empty. In Solaris this work is pure LIFO:
> i.e. alloc() and free() work with one bucket - the current bucket
> (it is called a magazine there), which is why the cache hit rate is higher.
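To make the LIFO point concrete, here is a minimal userland sketch in C of a cache in which alloc and free share one current bucket. It is neither the UMA code nor the Solaris code. The names (struct magazine, struct cpu_cache, cache_alloc, cache_free, MAG_ROUNDS) are made up for illustration, and the fallback to a depot or zone when the magazine runs empty or full is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAG_ROUNDS 4            /* hypothetical magazine capacity */

/* A magazine (bucket): a small LIFO stack of free object pointers. */
struct magazine {
    int   rounds;               /* number of objects currently cached */
    void *objs[MAG_ROUNDS];
};

/* Hypothetical per-CPU cache: one current magazine serves alloc AND free. */
struct cpu_cache {
    struct magazine *current;
};

/*
 * Pop the most recently freed object, which is the one most likely to be
 * still warm in the CPU cache. NULL means the magazine is empty and the
 * caller would have to fall back to a slower path (not shown).
 */
static void *cache_alloc(struct cpu_cache *c)
{
    struct magazine *m = c->current;

    if (m->rounds == 0)
        return NULL;
    return m->objs[--m->rounds];
}

/*
 * Push a freed object onto the same magazine. Returns 1 on success,
 * 0 if the magazine is full and a slower path (not shown) is needed.
 */
static int cache_free(struct cpu_cache *c, void *obj)
{
    struct magazine *m = c->current;

    assert(obj != NULL);
    if (m->rounds == MAG_ROUNDS)
        return 0;
    m->objs[m->rounds++] = obj;
    return 1;
}
```

With a single current bucket, an object that was just freed is the next one handed out, so a create/destroy loop keeps touching the same few objects.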
If you do as the paper does, and use the buckets for allocating buckets,
I would recommend you drop the free bucket list from the pool. If
bucket allocations are as cheap as they are supposed to be, there is no
need to keep a local list of empty buckets. :)
That just follows the principle stated in the paper of letting the
well-optimized parts do their part.
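A rough userland sketch in C of what dropping the local empty-bucket list could look like. It is an illustration under assumptions, not the actual UMA or Solaris code: struct bucket, struct pool, pool_alloc, pool_free, bucket_alloc and bucket_free are hypothetical names, and calloc()/free() merely stand in for a cheap bucket allocator that would itself be backed by buckets.

```c
#include <assert.h>
#include <stdlib.h>

#define BUCKET_SIZE 4           /* hypothetical bucket capacity */

struct bucket {
    struct bucket *next;        /* link on the pool's list of full buckets */
    int            count;
    void          *items[BUCKET_SIZE];
};

/* Stand-ins for a cheap bucket allocator (assumption: calloc/free). */
static struct bucket *bucket_alloc(void) { return calloc(1, sizeof(struct bucket)); }
static void bucket_free(struct bucket *b) { free(b); }

/* The pool keeps only full buckets; there is no list of empty ones. */
struct pool {
    struct bucket *current;     /* bucket used by both alloc and free */
    struct bucket *full;        /* list of full buckets */
};

/* Free: when the current bucket is full, park it and allocate a fresh one. */
static int pool_free(struct pool *p, void *obj)
{
    assert(obj != NULL);
    if (p->current->count == BUCKET_SIZE) {
        struct bucket *empty = bucket_alloc();

        if (empty == NULL)
            return 0;           /* caller must release obj some other way */
        p->current->next = p->full;
        p->full = p->current;
        p->current = empty;
    }
    p->current->items[p->current->count++] = obj;
    return 1;
}

/* Alloc: when the current bucket is empty, hand it straight back to the
 * bucket allocator instead of caching it, and take over a full bucket. */
static void *pool_alloc(struct pool *p)
{
    if (p->current->count == 0 && p->full != NULL) {
        struct bucket *b = p->full;

        p->full = b->next;
        bucket_free(p->current);
        p->current = b;
    }
    if (p->current->count == 0)
        return NULL;            /* nothing cached; fall back (not shown) */
    return p->current->items[--p->current->count];
}
```

The trade-off is one bucket allocation or release per exchange, in return for not maintaining a second list; that only pays off if the bucket allocator really is as cheap as assumed.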
P.S. I have most of a userland implementation of this done. Since someone
else has done the kernel side, I'll solely target userland for the code now.
John-Mark Gurney Voice: +1 415 225 5579
"All that I will do, has been done, All that I have, has not."