profiling with cc

Andreas Tobler toa at
Tue Feb 14 13:45:18 PST 2006

Marius Strobl wrote:

>>> It should work fine for C functions but the ENTRY macro in
>>> sparc64/include/asm.h has to be changed to call _mcount() so profiling
>>> info is also gathered for asm functions (e.g. those in libc).
>> Ok, after some reading I got confused.
>> Do you expect an ENTRY for normal operation and one for PROFILED work?
> An ENTRY_NOPROFILE which just never calls _mcount() and basically does
> what ENTRY currently does and a new ENTRY that calls _mcount() depending
> on whether GPROF is defined. I currently however don't see where/when
> GPROF would be defined for userland so this might actually be only
> relevant for kernel profiling. The latter would surprise me though.

Ok. Fine. And good, it's not just only me. Grepping through the sources 
made me wonder where all these defines are done, e.g. GPROF, GNUPROF.
And I got stuck....

>>>> Would you mind giving me some feedback if the attached is useful/needs 
>>>> rework?
>>> It has some style issues and inconsitencies (see style(9) and other
>>> inline asm in e.g. sparc64/include/cpufunc.h). Otherwise it looks
>>> good but probably needs a PIC version of the MCOUNT asm so it also
>>> works when compiling with -p.
>> Regarding inline asm, do you mean something like this:
>> #define MCOUNT ({                                               \
>>          __asm __volatile(                                       \
>>          "               .global __mcount ;              "       \
>>          "__mcount: ;                                    "       \
>>          "               add %o7, 8, %o1 ;               "       \
>>          "1:             call 2f ;                       "       \
>>          "               nop ;                           "       \
>>          "2:             add %o7, _mcount - 1b, %o2 ;    "       \
>>          "               ld [%o2], %o2 ;                 "       \
>>          "               jmpl %o2, %g0 ;                 "       \
>>          "               add %i7, 8, %o0 ; ");
>> })
> Yes, that's what I meant regarding the style of the inline asm. The
> only minor issue in the above version is that the instruction which
> is executed in the delay slot is not extra indented by one space.

Ok, I missed this one, I simply did not pay attention on delay slots and 
their handling. Heh, every bit in the existing code has its meaning....
And to be honest, I never paid attention to delay slots at all. Now I 
read the issue behind and I think I understand the basics.

>> style 9 is good to have, but I seem to run in corner cases (my 
>> impression). It would be helpful for me to point out exactly what you 
>> mean I'm doing 'wrong'.
> The other issue with your last patch was that you added macros with
> #defined<space> and changed one that way while the rest of the file
> adheres to style(9) and uses #define<tab>.

Hide myself :)

>> Going through the code shows some 
>> inconsistencies and I am confused, as said ;)
> Depends at what code you look at. F.e. the code in sys/kern and
> sys/sparc64 generally pretty much adheres to style(9) while quite
> a lot device drivers do not.

I was grepping through the whole src/sys and src/lib tree. Thanks for 
giving reliable parts where to look for good examples.
Maybe I was to generic when I said 'inconsistencies' but now I got more 
examples and I see where to go, certainly I won't catch all the details 
next time, but I hope a few more.

> Regarding the sparc64 MCOUNT macro for userland I think it would
> be better to follow the approach of the i386 version, i.e. to let
> it define a C function and only determine frompc and selfpc via
> inline asm instead of an asm function. That way we don't need to
> distinguish between the PIC and !PIC case as the compiler will
> do the necessary work for us. When compiling libc with -O2 GCC
> generates exactly the same asm we'd do in the pure asm variant
> for the !PIC case. For the PIC case the generated asm is even
> better optimized as we could do in a generic way in a pure asm
> variant.

To summarize:

- Take the sys/i386/include/profile.h as a 'template' for the mcount 
userland implementation.

- Take care of the formatting style and have a look in the sys/kern 
|sys/sparc64 code.

- Do _not_ change existing code if not necessary...

Marius, thanks a lot for you comments, I appreciate them as I am not 
used to the fbsd coding rules.


More information about the freebsd-sparc64 mailing list