Memory allocation performance
Julian Elischer
julian at elischer.org
Thu Jan 31 23:07:51 PST 2008
Alexander Motin wrote:
> Julian Elischer wrote:
>> Alexander Motin wrote:
>>> Hi.
>>>
>>> While profiling netgraph operation on a UP HEAD router I have found
>>> that a huge amount of time is spent on memory allocation/deallocation:
>>>
>>> index  %time  self  children  called           name
>>>               0.14  0.05      132119/545292        ip_forward <cycle 1> [12]
>>>               0.14  0.05      133127/545292        fxp_add_rfabuf [18]
>>>               0.27  0.10      266236/545292        ng_package_data [17]
>>> [9]    14.1   0.56  0.21      545292           uma_zalloc_arg [9]
>>>               0.17  0.00      545292/1733401       critical_exit <cycle 2> [98]
>>>               0.01  0.00      275941/679675        generic_bzero [68]
>>>               0.01  0.00      133127/133127        mb_ctor_pack [103]
>>>
>>>               0.15  0.06      133100/545266        mb_free_ext [22]
>>>               0.15  0.06      133121/545266        m_freem [15]
>>>               0.29  0.11      266236/545266        ng_free_item [16]
>>> [8]    15.2   0.60  0.23      545266           uma_zfree_arg [8]
>>>               0.17  0.00      545266/1733401       critical_exit <cycle 2> [98]
>>>               0.00  0.04      133100/133100        mb_dtor_pack [57]
>>>               0.00  0.00      134121/134121        mb_dtor_mbuf [111]
>>>
>>> I have already optimized every allocation call I could, and those that
>>> remain are practically unavoidable. But even after this, kgmon tells me
>>> that 30% of CPU time is consumed by memory management.
>>>
>>> So I have some questions:
>>> 1) Is this a real situation or just a profiler mistake?
>>> 2) If it is real, then why is UMA so slow? I have tried to replace it
>>> in some places with a preallocated TAILQ of the required memory blocks
>>> protected by a mutex, and according to the profiler I got _much_ better
>>> results. Would it be good practice to replace relatively small UMA
>>> zones with a preallocated queue to avoid some of the UMA calls?
>>> 3) I have seen that UMA does some kind of CPU cache affinity, but
>>> does it cost so much that it takes 30% of CPU time on a UP router?
>>
>> Given this information, I would add an 'item cache' in ng_base.c
>> (hmm, do I already have one?)
>
> That was actually my second question. As there are only 512 items by
> default and they are small, I can easily preallocate them all at
> boot. But is that a good approach? Why can't UMA do just the same when I
> have created a zone with a specified element size and maximum number of
> objects? What is the principal difference?
>
Who knows what UMA does... but if you do it yourself, you know what the
overhead is. :-)
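
For illustration, the "do it yourself" approach discussed above (a
preallocated TAILQ of fixed-size blocks protected by a mutex) might look
roughly like the userland sketch below. It is not the netgraph or UMA
code: the names item_pool, pool_item, pool_init, pool_get and pool_put
are hypothetical, the 64-byte payload size is an assumption, and a
pthread mutex stands in for a kernel mutex. Only the count of 512 items
comes from the thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

#define POOL_ITEMS 512  /* default item count mentioned in the thread */
#define ITEM_BYTES 64   /* hypothetical payload size (assumption) */

/* One preallocated block; 'link' threads it onto the free list. */
struct pool_item {
    TAILQ_ENTRY(pool_item) link;
    unsigned char payload[ITEM_BYTES];
};

TAILQ_HEAD(pool_head, pool_item);

/* All storage is reserved up front, so get/put never call an allocator. */
struct item_pool {
    pthread_mutex_t lock;        /* stand-in for a kernel mutex */
    struct pool_head free_list;  /* items currently available */
    size_t nfree;                /* number of items on free_list */
    struct pool_item storage[POOL_ITEMS];
};

/* Put every preallocated item on the free list. */
static void pool_init(struct item_pool *p)
{
    pthread_mutex_init(&p->lock, NULL);
    TAILQ_INIT(&p->free_list);
    for (size_t i = 0; i < POOL_ITEMS; i++)
        TAILQ_INSERT_TAIL(&p->free_list, &p->storage[i], link);
    p->nfree = POOL_ITEMS;
}

/* Take one item, or return NULL when the pool is exhausted. */
static struct pool_item *pool_get(struct item_pool *p)
{
    struct pool_item *it;

    pthread_mutex_lock(&p->lock);
    it = TAILQ_FIRST(&p->free_list);
    if (it != NULL) {
        TAILQ_REMOVE(&p->free_list, it, link);
        p->nfree--;
    }
    pthread_mutex_unlock(&p->lock);
    return it;
}

/* Return an item; inserting at the head reuses the most recently
 * freed (and most likely cache-warm) block first. */
static void pool_put(struct item_pool *p, struct pool_item *it)
{
    assert(it != NULL);
    pthread_mutex_lock(&p->lock);
    TAILQ_INSERT_HEAD(&p->free_list, it, link);
    p->nfree++;
    pthread_mutex_unlock(&p->lock);
}
```

The cost of each get/put here is one lock, a few pointer updates and one
unlock, which is the "you know what the overhead is" point. The trade-off
is that the pool cannot grow: once all 512 items are in use, pool_get
returns NULL and the caller has to handle that.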