cvs commit: src/sys/kern init_main.c kern_malloc.c md5c.c subr_autoconf.c subr_mbuf.c subr_prf.c tty_subr.c vfs_cluster.c vfs_subr.c

Peter Wemm peter at wemm.org
Tue Jul 22 14:30:30 PDT 2003


"Alan L. Cox" wrote:
> Marcel Moolenaar wrote:
> > 
> > On Tue, Jul 22, 2003 at 01:54:16PM -0500, Alan L. Cox wrote:
> > > >
> > > > `-finline-limit=N'
> > > >      By default, gcc limits the size of functions that can be inlined.
> > > >      This flag allows the control of this limit for functions that are
> > > >      explicitly marked as inline (i.e., marked with the inline keyword
> > > >      or defined within the class definition in c++).  N is the size of
> > > >      functions that can be inlined in number of pseudo instructions
> > > >      (not counting parameter handling).  The default value of N is 600.
> > > >      Increasing this value can result in more inlined code at the cost
> > > >      of compilation time and memory consumption.  Decreasing usually
> > > >
> > >
> > > There is another way.  The following example illustrates its use.
> > >
> > > static int    vm_object_backing_scan(vm_object_t object, int op)
> > > __attribute__((always_inline));
> > 
> > I hope we can come up with a scheme that allows us to control
> > inlining on a per-platform basis. Current events demonstrate
> > pretty well how people treat optimizations (which inlining is)
> > as machine independent fodder and how easy it is to generalize
> > beyond sensibility.
> > Unfortunately, the use of an expression-like syntax (inline or
> > __attribute__ keyword) makes this harder than with a statement-
> > like syntax (like #pragma), because of the 2-D space (platforms
> > vs functions).
> > 
> 
> I chose my example very carefully...
> 
> In the case of vm_object_backing_scan(), I could argue that "always
> inline" is correct regardless of platform.  This function was written
> with inlining as an expectation.  It looks something like this:
> 
> vm_object_backing_scan(..., int op)
> {
>   ...
>   if (op == "constant #1")
>     ...
>   else if (op == "constant #2")
>     ...
> 
> Furthermore, all call sites pass a constant as the value for op. 
> Consequently, if the code is inlined, all but the relevant case are
> removed as dead code.
> 
> I also recall this idiom being used in the i386 pmap.
> 
> I suspect that gcc fails to inline this code because it makes the inline
> vs. no-inline decision before it does dead code elimination.

Yes, your suspicion is correct.  It does its estimation before any
optimization, dead code elimination, etc.  Alexander mentioned the exact
way it does it, but it's something like "instruction_estimate = (number of
C keywords + some parser stuff) * 10", so all the do { } while (0) stuff
counts as ~100 instructions in the estimate.  A KOBJ call is estimated at
(I think Alexander mentioned) 571 instructions instead of a couple.
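
For illustration, here is a minimal sketch of the idiom Alan describes.
It is not the actual vm_object_backing_scan() code: scan(), the
SCAN_OP_* constants, count_nonzero() and sum_all() are hypothetical
names made up for this example.  Every call site passes a constant op,
so once the body is inlined with optimization enabled the compiler can
drop the branch that does not apply.  The always_inline attribute asks
gcc to inline regardless of its size estimate.

```c
#include <stddef.h>

/* Hypothetical op codes standing in for "constant #1" and "#2". */
#define SCAN_OP_COUNT	1
#define SCAN_OP_SUM	2

/* The declaration carries the attribute, as in the example above. */
static inline int scan(const int *v, size_t n, int op)
    __attribute__((always_inline));

/*
 * One generic loop whose behaviour is selected by op.  Written on the
 * assumption that it is inlined into callers that pass a constant op.
 */
static inline int
scan(const int *v, size_t n, int op)
{
	int result = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (op == SCAN_OP_COUNT)
			result += (v[i] != 0);	/* count non-zero entries */
		else if (op == SCAN_OP_SUM)
			result += v[i];		/* add up all entries */
	}
	return (result);
}

/* Each wrapper passes a constant, so only one branch is relevant. */
int
count_nonzero(const int *v, size_t n)
{
	return (scan(v, n, SCAN_OP_COUNT));
}

int
sum_all(const int *v, size_t n)
{
	return (scan(v, n, SCAN_OP_SUM));
}
```

Without the attribute, a size estimate made before dead code
elimination counts both branches, even though only one survives at
each call site.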

The bad thing here is that since ~gcc-3.1, all these inlines have been
silently turned off without warning.  This might explain some of the stack
issues with kobj/newbus/etc on the expensive function call architectures.

Cheers,
-Peter
--
Peter Wemm - peter at wemm.org; peter at FreeBSD.org; peter at yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5


