svn commit: r242014 - head/sys/kern

Andre Oppermann andre at freebsd.org
Thu Oct 25 16:23:52 UTC 2012


On 25.10.2012 05:49, Bruce Evans wrote:
> On Wed, 24 Oct 2012, Attilio Rao wrote:
>
>> On Wed, Oct 24, 2012 at 8:16 PM, Andre Oppermann <andre at freebsd.org> wrote:
>>> ...
>>> Let's go back and see how we can do this the sanest way.  These are
>>> the options I see at the moment:
>>>
>>>  1. sprinkle __aligned(CACHE_LINE_SIZE) all over the place
>>
>> This is wrong because it doesn't give padding.
>
> Unless it is sprinkled in struct declarations.
>
>>>  2. use a macro like MTX_ALIGN that can be SMP/UP aware and in
>>>     the future possibly change to a different compiler dependent
>>>     align attribute
>>
>> What is this macro supposed to do? I don't understand that from your
>> description.
>>
>>>  3. embed __aligned(CACHE_LINE_SIZE) into struct mtx itself so it
>>>     automatically gets aligned in all cases, even when dynamically
>>>     allocated.
>>
>> This works but I think it is overkill for structures including sleep
>> mutexes which are the vast majority. So I certainly wouldn't be in
>> favor of such a patch.
>
> This doesn't work either with fully dynamic (auto) allocations.  Stack
> alignment is generally broken (limited, and pessimized for both space
> and time) in gcc (it works better in clang).  On amd64, it is limited
> by the default of -mpreferred-stack-boundary=4.  Since 2**4 is smaller
> than the cache line size and stack alignments larger than it are broken
> in gcc, __aligned(CACHE_LINE_SIZE) never works (except accidentally,
> 16/CACHE_LINE_SIZE of the time).  On i386, we reduce the space/time
> pessimizations a little by overriding the default to
> -mpreferred-stack-boundary=2.  2**2 is even smaller than the cache
> line size.  (The pessimizations are for both space and time, since
> time and code space are wasted for the code to keep the stack aligned,
> and cache space and thus also time are wasted for padding.  Most
> functions don't benefit from more than sizeof(register_t) alignment.)

I'm not aware of stack allocated mutexes anywhere in the kernel.
Even if there is such a case, it's very special and unique.

I've verified that __aligned(CACHE_LINE_SIZE) on the definition of
struct mtx itself (in sys/_mutex.h) correctly aligns and pads the
global .bss resident mutexes for 64B and 128B cache line sizes.
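
A minimal userland sketch of this kind of check, not the actual test
that was run. It assumes a 64-byte cache line, and struct demo_mtx is
a hypothetical stand-in, not the real layout from sys/_mutex.h. The
point it illustrates is that an aligned attribute on the struct
definition rounds sizeof up as well, so neighbouring objects in .bss
cannot share a line:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte lines.  The kernel takes this from machine/param.h. */
#define CACHE_LINE_SIZE 64

/* Hypothetical stand-in for struct mtx; field names are illustrative only. */
struct demo_mtx {
    const char *name;
    volatile uintptr_t lock;
} __attribute__((aligned(CACHE_LINE_SIZE)));

/* The attribute on the type raises its alignment requirement ... */
static_assert(_Alignof(struct demo_mtx) == CACHE_LINE_SIZE,
    "demo_mtx is not cache line aligned");
/* ... and pads sizeof to a multiple of it, which is what gives padding. */
static_assert(sizeof(struct demo_mtx) % CACHE_LINE_SIZE == 0,
    "demo_mtx is not padded to a full cache line");

/* Two zero-initialized globals of this type end up in .bss. */
static struct demo_mtx demo_locks[2];

/* Returns nonzero if p starts on a cache line boundary. */
static int
cache_aligned(const void *p)
{
    return (((uintptr_t)p % CACHE_LINE_SIZE) == 0);
}
```

Changing CACHE_LINE_SIZE to 128 in the sketch exercises the other case;
the static assertions fail the build if either property is lost.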

> Dynamic allocations via malloc() get whatever alignment malloc() gives.
> This is only required to be 4 or 8 or 16 or so (the maximum for a C
> object declared in conforming C (no __align())), but malloc() usually
> gives more.  If it gives CACHE_LINE_SIZE, that is wasteful for most
> small allocations.

Stand-alone mutexes are normally not malloc'ed.  They're always
embedded into some larger structure they protect.
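
For the embedded case, a rough sketch of what an aligned mutex type
does to the structure around it. Again this assumes 64-byte lines, and
both struct demo_mtx and struct demo_softc are hypothetical, made up
for illustration. The containing structure inherits the alignment
requirement and picks up padding before and after the mutex:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Assumption: 64-byte lines. */
#define CACHE_LINE_SIZE 64

/* Hypothetical stand-in for a struct mtx aligned at its definition. */
struct demo_mtx {
    const char *name;
    volatile uintptr_t lock;
} __attribute__((aligned(CACHE_LINE_SIZE)));

/* Hypothetical larger structure protected by an embedded mutex. */
struct demo_softc {
    int sc_refcnt;            /* offset 0, followed by padding */
    struct demo_mtx sc_lock;  /* pushed out to the next line boundary */
    int sc_flags;             /* starts the line after the mutex */
};

/* The container inherits the strictest member alignment. */
static_assert(_Alignof(struct demo_softc) == CACHE_LINE_SIZE,
    "container did not inherit the mutex alignment");
/* The mutex offset is a multiple of the line size within the struct. */
static_assert(offsetof(struct demo_softc, sc_lock) % CACHE_LINE_SIZE == 0,
    "embedded mutex is not on a line boundary within the struct");
```

Two ints plus a small mutex grow to three full lines here, which is the
space cost being argued about. The offsets are only relative to the
start of the structure: whether the mutex lands on a real cache line
boundary still depends on the allocator returning suitably aligned
memory.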

> __builtin_alloca() is broken in gcc-3.3.3, but works in gcc-4.2.1, at
> least on i386.  In gcc-3.3.3, it assumes that the stack is the default
> 16-byte aligned even if -mpreferred-stack-boundary=2 is in CFLAGS to
> say otherwise, and just subtracts from the stack pointer.  In gcc-4.2.1,
> it does the necessary andl of the stack pointer, but only 16-byte
> alignment.
>
> It is another bug that there are no extensions of malloc() or alloca().
> Since malloc() is in the library and may give CACHE_LINE_SIZE but
> __builtin_alloca() is in the compiler and only gives 16, these functions
> are not even as compatible as they should be.
>
> I don't know of any mutexes allocated on the stack, but there are stack
> frames with mcontexts in them that need special alignment so they cause
> problems on i386.  They can't just be put on the stack due to the above
> bugs. They are laboriously allocated using malloc().  Since they are
> quite large, 1 mcontext barely fits on the kernel stack, so kib didn't
> like my alloca() method for allocating them.

You lost me here.

-- 
Andre


