svn commit: r242014 - head/sys/kern
Andre Oppermann
andre at freebsd.org
Thu Oct 25 16:23:52 UTC 2012
On 25.10.2012 05:49, Bruce Evans wrote:
> On Wed, 24 Oct 2012, Attilio Rao wrote:
>
>> On Wed, Oct 24, 2012 at 8:16 PM, Andre Oppermann <andre at freebsd.org> wrote:
>>> ...
>>> Let's go back and see how we can do this the sanest way. These are
>>> the options I see at the moment:
>>>
>>> 1. sprinkle __aligned(CACHE_LINE_SIZE) all over the place
>>
>> This is wrong because it doesn't give padding.
>
> Unless it is sprinkled in struct declarations.
>
>>> 2. use a macro like MTX_ALIGN that can be SMP/UP aware and in
>>> the future possibly change to a different compiler dependent
>>> align attribute
>>
>> What is this macro supposed to do? I don't understand that from your
>> description.
>>
>>> 3. embed __aligned(CACHE_LINE_SIZE) into struct mtx itself so it
>>> automatically gets aligned in all cases, even when dynamically
>>> allocated.
>>
>> This works but I think it is overkill for structures including sleep
>> mutexes which are the vast majority. So I certainly wouldn't be in
>> favor of such a patch.
>
> This doesn't work either with fully dynamic (auto) allocations. Stack
> alignment is generally broken (limited, and pessimized for both space
> and time) in gcc (it works better in clang). On amd64, it is limited
> by the default of -mpreferred-stack-boundary=4. Since 2**4 is smaller
> than the cache line size and stack alignments larger than it are broken
> in gcc, __aligned(CACHE_LINE_SIZE) never works (except accidentally,
> 16/CACHE_LINE_SIZE of the time). On i386, we reduce the space/time
> pessimizations a little by overriding the default to
> -mpreferred-stack-boundary=2. 2**2 is even smaller than the cache
> line size. (The pessimizations are for both space and time, since
> time and code space are wasted for the code to keep the stack aligned,
> and cache space and thus also time are wasted for padding. Most
> functions don't benefit from more than sizeof(register_t) alignment.)
I'm not aware of stack allocated mutexes anywhere in the kernel.
Even if there is such a case, it's very special and unique.
I've verified that __aligned(CACHE_LINE_SIZE) on the definition of
struct mtx itself (in sys/_mutex.h) correctly aligns and pads the
global .bss resident mutexes for 64B and 128B cache line sizes.
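As a rough userland illustration of what that check amounts to (a sketch only: struct demo_mtx is a hypothetical stand-in for struct mtx, CACHE_LINE_SIZE is assumed to be 64, and the gcc/clang attribute syntax is spelled out instead of the kernel's __aligned() macro):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for this sketch; the kernel defines its own. */
#define CACHE_LINE_SIZE 64

/* Hypothetical stand-in for struct mtx: two pointer-sized words. */
struct demo_mtx {
	void			*lock_object;
	volatile uintptr_t	 mtx_lock;
} __attribute__((aligned(CACHE_LINE_SIZE)));

/* Two adjacent .bss-resident locks. */
static struct demo_mtx demo_locks[2];

/*
 * Returns 1 if both locks start on a cache line boundary and the
 * second one begins exactly one padded struct after the first.
 */
static int
demo_locks_are_isolated(void)
{
	uintptr_t a = (uintptr_t)&demo_locks[0];
	uintptr_t b = (uintptr_t)&demo_locks[1];

	return (a % CACHE_LINE_SIZE == 0 && b % CACHE_LINE_SIZE == 0 &&
	    b - a == sizeof(struct demo_mtx));
}
```

Because the attribute is on the type, sizeof() is rounded up to the alignment, so the padding comes along with it and adjacent globals cannot share a line.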
> Dynamic allocations via malloc() get whatever alignment malloc() gives.
> This is only required to be 4 or 8 or 16 or so (the maximum for a C
> object declared in conforming C (no __align())), but malloc() usually
> gives more. If it gives CACHE_LINE_SIZE, that is wasteful for most
> small allocations.
Stand-alone mutexes are normally not malloc'ed. They're always
embedded into some larger structure they protect.
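For the embedded case, a small sketch of what a type-level alignment does to the containing structure (struct demo_softc and struct demo_mtx are hypothetical names, and CACHE_LINE_SIZE is again assumed to be 64):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for this sketch; the kernel defines its own. */
#define CACHE_LINE_SIZE 64

/* Hypothetical stand-in for struct mtx, aligned at the type level. */
struct demo_mtx {
	void			*lock_object;
	volatile uintptr_t	 mtx_lock;
} __attribute__((aligned(CACHE_LINE_SIZE)));

/* Hypothetical structure with an embedded lock protecting sc_count. */
struct demo_softc {
	int		sc_flags;
	struct demo_mtx	sc_lock;	/* pushed to the next line boundary */
	int		sc_count;
};

/* Offset of the embedded lock within its container. */
static size_t
demo_lock_offset(void)
{
	return (offsetof(struct demo_softc, sc_lock));
}
```

The container inherits the 64-byte alignment and grows to three cache lines for two ints and a lock, which is the space cost being argued about. The offsets are only meaningful at run time if the allocator hands back a suitably aligned container in the first place.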
> __builtin_alloca() is broken in gcc-3.3.3, but works in gcc-4.2.1, at
> least on i386. In gcc-3.3.3, it assumes that the stack is the default
> 16-byte aligned even if -mpreferred-stack-boundary=2 is in CFLAGS to
> say otherwise, and just subtracts from the stack pointer. In gcc-4.2.1,
> it does the necessary andl of the stack pointer, but only 16-byte
> alignment.
>
> It is another bug that there are no extensions of malloc() or alloca().
> Since malloc() is in the library and may give CACHE_LINE_SIZE but
> __builtin_alloca() is in the compiler and only gives 16, these functions
> are not even as compatible as they should be.
>
> I don't know of any mutexes allocated on the stack, but there are stack
> frames with mcontexts in them that need special alignment so they cause
> problems on i386. They can't just be put on the stack due to the above
> bugs. They are laboriously allocated using malloc(). Since they are
> quite large, 1 mcontext barely fits on the kernel stack, so kib didn't
> like my alloca() method for allocating them.
You lost me here.
--
Andre