profiling kernel modules.

Mon Dec 14 19:56:10 UTC 2009

Ryan Stone wrote:
> I find that the best way to profile the kernel is with pmc.  You don't
> need to compile anything with a special option (other than including
> the hwpmc hooks in the kernel with the HWPMC_HOOKS option) so you can
> use it at any time on the same code you'll be shipping.  pmc does
> statistical profiling; it uses whatever performance monitoring
> counters are provided by the hardware.  It has a pretty low overhead,
> especially compared with other profiling techniques.  It's really easy
> to use, too:

Thanks for all this.

BTW, I just tried the old kgmon/gprof profiling as a control.
It appears that it doesn't work on amd64: gprof can't read the file
that the kernel puts out. (Useful!)

> 
> 1) If hwpmc is not compiled into your kernel, kldload hwpmc
> 2) Run pmcstat to begin taking samples (make sure that whatever you are
> profiling is busy doing work first!):
> 
> pmcstat -S unhalted-cycles -O /tmp/samples.out
> 
> The -S option specifies what event you want to use to trigger
> sampling.  The unhalted-cycles is the best event to use if your
> hardware supports it; pmc will take a sample every 64K non-idle CPU
> cycles, which is basically equivalent to sampling based on time.  If
> the unhalted-cycles event is not supported by your hardware then the
> instructions event will probably be the next best choice (although it's
> nowhere near as good, as it will not be able to tell you, for example,
> if a particular function is very expensive because it takes a lot of
> cache misses compared to the rest of your program).  One caveat with
> the unhalted-cycles event is that time spent spinning on a spinlock or
> adaptively spinning on a MTX_DEF mutex will not be counted by this
> event, because most of the spinning time is spent executing an hlt
> instruction that idles the CPU for a short period of time.
> 
> Modern Intel and AMD CPUs offer a dizzying array of events.  They're
> mostly only useful if you suspect that a particular kind of event is
> hurting your performance and you would like to know what is causing
> those events.  For example, if you suspect that data cache misses are
> causing you problems you can take samples on cache misses.
> Unfortunately on some of the newer CPUs (namely the Core2 family,
> because that's what I'm doing most of my profiling on nowadays) I find
> it difficult to figure out just what event to use to profile based on
> cache misses.  "man pmc" will give you an overview of pmc, and there are
> manpages for every CPU family supported (e.g. "man pmc.core2").
> 
> 3) After you've run pmcstat for "long enough" (a proper definition of
> long enough requires a statistician, which I most certainly am not,
> but I find that for a busy system 10 seconds is enough), Control-C it
> to stop it*.  You can use pmcstat to post-process the samples into
> human-readable text:
> 
> pmcstat -R /tmp/samples.out -G /tmp/graph.txt
> 
> The graph.txt file will show leaf functions on the left and their
> callers beneath them, indented to reflect the callchain.  It's not too
> easy to describe and I don't have sample output available right now.
> 
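
For reference, steps 1 to 3 above could be strung together roughly as
follows. This is only a sketch under a few assumptions: the run() helper
and the DRY_RUN, EVENT, SAMPLES, GRAPH and DURATION variables are names
made up for illustration, "kldload -n" is assumed to be available (it
skips the load if hwpmc is already present), and the fixed-length
"sleep" form from the footnote stands in for pressing Control-C. By
default it only prints the commands; start the workload first and set
DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only: wraps the three steps above. With DRY_RUN=1 (the default)
# the commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
EVENT="${EVENT:-unhalted-cycles}"        # sampling event from step 2
SAMPLES="${SAMPLES:-/tmp/samples.out}"   # raw sample log from step 2
GRAPH="${GRAPH:-/tmp/graph.txt}"         # callgraph output from step 3
DURATION="${DURATION:-10}"               # seconds to sample (illustrative)

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

profile_once() {
    # Step 1: load hwpmc unless it is already loaded (assumes kldload -n).
    run kldload -n hwpmc || return 1
    # Step 2: sample for a fixed time by having pmcstat run sleep.
    run pmcstat -S "$EVENT" -O "$SAMPLES" sleep "$DURATION" || return 1
    # Step 3: post-process the raw samples into a text callgraph.
    run pmcstat -R "$SAMPLES" -G "$GRAPH"
}

profile_once
```
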
> 
> Another interesting tool for post-processing the samples is
> pmcannotate.  I've never actually used the tool before but it will
> annotate the program's source to show which lines are the most
> expensive.  This of course needs unstripped modules to work.  I think
> that it will also work if the GNU "debug link" is in the stripped
> module pointing to the location of the file with symbols.
> 
> 
> * Here's a tip I picked up from Joseph Koshy's blog: to collect
> samples for a fixed period of time (say 1 minute), have pmcstat run the
> sleep command:
> 
> pmcstat -S unhalted-cycles -O /tmp/samples.out sleep 60
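
As a rough sanity check on how much data such a fixed-length run might
collect: taking the 64K-cycle sampling interval quoted above at face
value, the upper bound per fully busy CPU is simply clock rate times
duration divided by 65536. The sketch below assumes a hypothetical
2 GHz clock; estimate_samples is an illustrative helper, not part of
pmcstat, and real counts will be lower whenever the CPU is idle.

```shell
#!/bin/sh
# estimate_samples HZ SECONDS [INTERVAL]
# Upper bound on samples from one fully busy CPU, assuming one sample
# every INTERVAL unhalted cycles (65536 is the 64K figure quoted above).
estimate_samples() {
    hz=$1
    secs=$2
    interval=${3:-65536}
    echo $(( $hz * $secs / $interval ))
}

# Hypothetical 2 GHz CPU sampled for the 60-second run shown above.
estimate_samples 2000000000 60    # prints 1831054
```

So a one-minute run on a busy machine could yield on the order of a
couple of million samples per CPU, under those assumptions.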