New libc malloc patch

Sun Dec 11 19:31:17 PST 2005

On Dec 11, 2005, at 7:16 PM, Johan Bucht wrote:
> Jason Evans wrote:
>>
>> On Dec 11, 2005, at 5:48 PM, Johan Bucht wrote:
>>> * Saw you used a tree to look up the allocation sizes at free, this
>>> * differs from how I and Solaris libumem does it. Instead I went for
>>> * prepending a 8/16 byte tag containing the size of the  
>>> allocation for
>>> * lockless size lookup.
>>
>> Actually, my implementation prepends a 4 byte tag that contains   
>> allocated/free flags, and the size of the allocation.  The trees  
>> are  only used to look up the sizes of huge allocations (> 8 MB on  
>> 32-bit  platforms, and > 256 MB on 64-bit platforms).  Huge  
>> allocations start  at chunk boundaries, so prepending a tag would  
>> require an entire  extra page (and would cause potentially serious  
>> external  fragmentation on 32-bit systems).
>>
> Isn't 8 byte alignment expected by some applications?

Yes, 8 or 16 byte alignment is expected (in fact I'm of the opinion  
that we should be using 16 byte alignment on i386 due to SSE2/SSE3).   
If I understand your question correctly, you're asking how I get away  
with 4 byte tags.  Consider that (assuming 8-byte quantum) a malloc 
(16) call must actually be serviced by at least 24 bytes internally,  
in order to pad to the next quantum size.  If I used 8 byte tags,  
then malloc(17) would require 32 bytes internally.  By using 4 byte  
tags, malloc(13)..malloc(20) can be serviced by 24 bytes internally.   
At least one implementation (the OS X malloc implementation) uses 2  
byte tags, but this has two problems: 1) unaligned memory access is  
slow on some architectures, and 2) it's too small to handle large  
object sizes, so a different allocation strategy has to be used  
starting at ~12 KB.

> How do you know if a allocation is huge if you don't have a tag?

I know an allocation is huge if its base address is chunk-aligned.   
The actual size is stored in a red-black tree node, which is  
allocated separately.

Jason