kernel memory allocator: UMA or malloc?
Rick Macklem
rmacklem at uoguelph.ca
Thu Mar 13 22:22:21 UTC 2014
John-Mark Gurney wrote:
> Rick Macklem wrote this message on Wed, Mar 12, 2014 at 21:59 -0400:
> > John-Mark Gurney wrote:
> > > Rick Macklem wrote this message on Tue, Mar 11, 2014 at 21:32
> > > -0400:
> > > > I've been working on a patch provided by wollman@, where
> > > > he uses UMA instead of malloc() to allocate an iovec array
> > > > for use by the NFS server's read.
> > > >
> > > > So, my question is:
> > > > When is it preferable to use UMA(9) vs malloc(9) if the
> > > > allocation is going to be a fixed size?
> > >
> > > UMA has benefits if the structure size is uniform and a non-power
> > > of 2. In this case, it can pack the items more densely: say, a
> > > 192-byte allocation can fit 21 items in a 4k page, versus malloc,
> > > which would round it up to 256 bytes, leaving only 16 per page...
> > > These counts per page are probably different as UMA may keep some
> > > information in the page...
> > >
> > Ok, this one might apply. I need to look at the size.
> >
> > > It also has the benefit of being able to keep allocations "half
> > > alive"... "freed" objects can remain partly initialized, with
> > > references to buffers and other allocations still held by them...
> > > Then, if the system needs to fully free your allocation, it can,
> > > and will call your function to release these remaining
> > > resources... Look at the ctor/dtor and uminit/fini functions in
> > > uma(9) for more info...
> > >
> > > uma also allows you to set a hard limit on the number of
> > > allocations the zone provides...
> > >
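For reference, the uminit/fini hooks and the hard limit described above fit together roughly like this (a non-compilable kernel sketch; the zone name, the NFS_IOVMAX macro, and the 1024 cap are all made up for illustration):

```c
/* Hypothetical zone for a fixed-size iovec array. */
static uma_zone_t iov_zone;

static int
iov_init(void *mem, int size, int flags)
{
	/* One-time setup when a fresh item enters the zone. */
	return (0);
}

static void
iov_fini(void *mem, int size)
{
	/*
	 * Called only when the system reclaims the item for good;
	 * this is where the "half alive" resources get released.
	 */
}

void
iov_zone_setup(void)
{
	/*
	 * ctor/dtor run on every uma_zalloc()/uma_zfree(); uminit/fini
	 * run only when items enter or leave the zone itself.
	 */
	iov_zone = uma_zcreate("nfsd_iov",
	    NFS_IOVMAX * sizeof(struct iovec),
	    NULL, NULL,			/* no per-alloc ctor/dtor needed */
	    iov_init, iov_fini, UMA_ALIGN_PTR, 0);
	uma_zone_set_max(iov_zone, 1024);	/* optional hard cap */
}
```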
> > Yep. None of the above applies to this case, but thanks for the
> > good points for a future case. (I've seen where this gets used for
> > the "secondary zone" for mbuf+cluster.)
> >
> > > Hope this helps...
> > >
> > Yes, it did. Thanks.
> >
> > Does anyone know if there is a significant performance difference
> > if the allocation is a power of 2 and the "half alive" cases don't
> > apply?
>
> From my understanding, the malloc case is "slightly" slower as it
> needs to look up which bucket to use, but after the lookup, the
> buckets are UMA, so the performance will be the same...
>
> > Thanks all for your help, rick
> > ps: Garrett's patch switched to using a fixed size allocation and
> > using UMA(9). Since I have found that a uma allocation request
> > with M_WAITOK can get the thread stuck sleeping in "btalloc", I am
> > a bit shy of using it when I've never
>
> Hmm... I took a look at the code, and if you're stuck in btalloc,
> either pause(9) isn't working, or you're looping, which probably
> means you're really low on memory...
>
Well, this was an i386 with the default of about 400Mbytes of kernel
memory (address space if I understand it correctly). Since it seemed
to persist in this state, I assumed that it was looping and, therefore,
wasn't able to find a page sized and page aligned chunk of kernel
address space to use. (The rest of the system was still running ok.)
I did email about this and since no one had a better explanation/fix,
I avoided the problem by using M_NOWAIT on the m_getjcl() call.
Although I couldn't reproduce this reliably, it seemed to happen more
easily when my code was doing a mix of MCLBYTES and MJUMPAGESIZE cluster
allocations. Again, just a hunch, but maybe the MCLBYTES cluster
allocations were fragmenting the address space to the point where a
page-sized chunk aligned to a page boundary couldn't be found.
Alternatively, the code for M_WAITOK is broken in some way not obvious
to me.
Either way, I avoid it by using M_NOWAIT. I also fall back on:
MGET(..M_WAITOK);
MCLGET(..M_NOWAIT);
which has a "side effect" of draining the mbuf cluster zone if the
MCLGET(..M_NOWAIT) fails to get a cluster. (For some reason m_getcl()
and m_getjcl() do not drain the cluster zone when they fail?)
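The fallback described above looks roughly like this (a kernel sketch with error handling simplified; the caller and surrounding code are assumed):

```c
/*
 * Try a page-size jumbo cluster without sleeping; if that fails,
 * fall back to MGET(M_WAITOK) plus MCLGET(M_NOWAIT), whose failure
 * path has the side effect of draining the cluster zones.
 */
struct mbuf *m;

m = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, MJUMPAGESIZE);
if (m == NULL) {
	MGET(m, M_WAITOK, MT_DATA);
	MCLGET(m, M_NOWAIT);
	if ((m->m_flags & M_EXT) == 0) {
		/* Still no cluster attached; caller must cope. */
		m_freem(m);
		m = NULL;
	}
}
```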
One of the advantages of having very old/small hardware to test on;-)
> > had a problem with malloc(). Btw, this was for a pagesize cluster
> > allocation, so it might be related to the alignment requirement
> > (and running on a small i386, so the kernel address space is
> > relatively small).
>
> Yeah, if you put additional alignment requirements on it, that's
> probably it; but if you needed these alignment requirements, how was
> malloc satisfying your request?
>
This was for an m_getjcl(MJUMPAGESIZE, M_WAITOK..), so for this case
I've never done a malloc(). The code in head (which my patch uses as
a fallback when m_getjcl(..M_NOWAIT..) fails) does, as above:
MGET(..M_WAITOK);
MCLGET(..M_NOWAIT);
> > I do see that switching to a fixed size allocation to cover the
> > common case is a good idea, but I'm not sure if setting up a uma
> > zone is worth the effort over malloc()?
>
> I'd say it depends upon the number and size of the allocations...
> If you're allocating many megabytes of memory, and the wastage is
> 50%+, then think about it; but if it's just a few objects, then the
> coding time and maintenance isn't worth it...
>
Btw, I think the allocation is a power of 2. (It is a power of 2 times
sizeof(struct iovec), and it looks to me that sizeof(struct iovec) is
a power of 2 as well. I know it is 8 on i386, and I think most 64-bit
arches will make it 16, since it is a pointer plus a size_t.)
This was part of Garrett's patch, so I'll admit I would have been too
lazy to do it. ;-) Now it's in the current patch, so unless there seems
to be a reason to take it out..??
Garrett mentioned that UMA(9) has a per-CPU cache. I'll admit I don't
know what that implies?
- I might guess that a per-CPU cache would be useful for items that get
re-allocated a lot with minimal change to the data in the slab.
--> It seems to me that if most of the bytes in the slab have the
same bits, then you might improve hit rate on the CPU's memory
caches, but since I haven't looked at this, I could be way off??
- For this case, the iovec array that is allocated is filled in with
different mbuf data addresses each time, so "minimal change" doesn't
apply.
- Does the per-CPU cache help w.r.t. UMA(9) internal code perf?
So, lots of questions that I don't have an answer for. However, unless
there is a downside to using UMA(9) for this, the code is written and
I'm ok with it.
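For what it's worth, my understanding is that the per-CPU cache is mainly about lock avoidance rather than data-cache warmth: the common alloc/free path is served from the current CPU's bucket without taking the zone-wide lock. Usage is just (a sketch; "myzone" stands in for a zone created earlier with uma_zcreate(9)):

```c
/*
 * uma_zalloc(9) first tries the per-CPU bucket cache, so the hot
 * path avoids contention on the zone lock entirely.
 */
void *item;

item = uma_zalloc(myzone, M_WAITOK);
/* ... fill in the iovec array, do the I/O ... */
uma_zfree(myzone, item);
```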
Thanks for all the good comments, rick
> --
> John-Mark Gurney Voice: +1 415 225 5579
>
> "All that I will do, has been done, All that I have, has not."
More information about the freebsd-hackers mailing list