optimization levels for 6-STABLE build{kernel,world}

Chuck Swiger cswiger at mac.com
Thu Sep 14 09:45:46 PDT 2006


On Sep 13, 2006, at 9:42 PM, Gary Kline wrote:
>> -funroll-loops is as likely to decrease performance for a particular
>> program as it is to help.
>
> 	Isn't the compiler intelligent enough to have a reasonable
> 	limit, N, of the loops it will unroll to ensure a faster runtime?
> 	Something much less than 1000, say; possibly less than 100.

Of course; in fact, N is probably closer to 4 or 8 than it is to 100.
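
As a rough sketch of the tradeoff involved (not gcc's actual
transformation, which is driven by its own cost heuristics),
unrolling by 4 looks something like this:

#include <stddef.h>

/* One compare-and-branch per element. */
void
scale(float *a, float s, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        a[i] *= s;
}

/* Unrolled by 4: one compare-and-branch per four elements, at the
 * cost of a larger loop body plus cleanup code for the leftover
 * iterations.  If the enlarged body stops fitting in the I-cache,
 * the "optimization" loses. */
void
scale4(float *a, float s, size_t n)
{
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        a[i]     *= s;
        a[i + 1] *= s;
        a[i + 2] *= s;
        a[i + 3] *= s;
    }
    for (; i < n; i++)          /* remainder */
        a[i] *= s;
}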

> 	At least, if the initialization and end-loop code *plus* the
> 	loop code itself were too large for the cache, my thought is that
> 	gcc would back out.

Unless you've indicated that the compiler should target a specific
CPU architecture, there is no way for it to know whether the size of
the L1 cache on the machine doing the compile is the same as, or even
similar to, the size of the cache on the system where the code will
run.
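
That's what the CPUTYPE knob in /etc/make.conf (or gcc's -march
flag directly) is for.  A minimal sketch, assuming an i386 box with
a Pentium 4:

# /etc/make.conf
CPUTYPE?=pentium4

...which the bsd.cpu.mk machinery turns into roughly:

% gcc -O2 -march=pentium4 -c foo.c

Without -march, gcc has to emit conservative code that will run on
any CPU of the target architecture, and it can only guess at cache
sizes.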

>         I may be giving RMS too much credit, but
> 	if memory serves, the compiler was GNU's first project.  And
> 	Stallman was into GOFAI, &c, for better/worse.[1]  Anyway, for now
> 	I'll comment out the unroll-loops arg.

cd /usr/src/contrib/gcc && grep Stallman ChangeLog

...returns no results.  A tool I wrote suggests:

% histogram.py -F'  ' -f 2,3 -p @ -c 10 ChangeLog
61 Kazu Hirata <kazu at cs.umass.edu>
51 Eric Botcazou <ebotcazou at libertysurf.fr>
48 Jan Hubicka <jh at suse.cz>
39 Richard Sandiford <rsandifo at redhat.com>
37 Alan Modra <amodra at bigpond.net.au>
30 Richard Henderson <rth at redhat.com>
29 Joseph S. Myers <jsm at polyomino.org.uk>
27 Jakub Jelinek <jakub at redhat.com>
25 Zack Weinberg <zack at codesourcery.com>
22 Mark Mitchell <mark at codesourcery.com>
20 John David Anglin <dave.anglin at nrc-cnrc.gc.ca>
20 Ulrich Weigand <uweigand at de.ibm.com>
17 Rainer Orth <ro at TechFak.Uni-Bielefeld.DE>
16 Kelley Cook <kcook at gcc.gnu.org>
16 Roger Sayle <roger at eyesopen.com>
13 David Edelsohn <edelsohn at gnu.org>
12 Aldy Hernandez <aldyh at redhat.com>
11 Stephane Carrez <stcarrez at nerim.fr>
11 Ian Lance Taylor <ian at wasabisystems.com>
10 Andrew Pinski <pinskia at physics.uc.edu>
10 Kaz Kojima <kkojima at gcc.gnu.org>
10 James E Wilson <wilson at specifixinc.com>


>> A safe optimizer must assume that an arbitrary assignment via a
>> pointer dereference can change any value in memory, which means that
>> you have to spill and reload any data being cached in CPU registers
>> around the use of the pointer, except for const's, variables declared
>> as "register", and possibly function arguments being passed via
>> registers and not on the stack (cf "register windows" on the SPARC
>> hardware, or HP/PA's calling conventions).
> 	
> 	Well, I'd added the no-strict-aliasing flag to make.conf!
> 	Pointers give me indigestion ... even after all these years.
> 	Thanks for your insights.  And the URL.

You're welcome.
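
For the curious, here is a minimal sketch of what's at stake with
aliasing (illustrative only; gcc's real analysis is more involved):

/* Under -fstrict-aliasing, gcc may assume that a store through a
 * float * cannot modify an int, so *n can be kept in a register
 * for the whole loop.  With -fno-strict-aliasing it must assume
 * that f[i] might overlay *n, and reload *n from memory after
 * every store. */
void
fill(float *f, int *n)
{
    int i;

    for (i = 0; i < *n; i++)
        f[i] = 0.0f;
}

The C99 "restrict" qualifier is the explicit way to promise the
compiler that such overlap cannot happen.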

> 	gary
>
> [1]. Seems to me that "good old-fashioned AI" techniques would work in
>      something like a compiler, where you probably have a good idea of
>      most heuristics.   -gk

Of course.  The optimizations enabled by -O or -O2 are the ones
which are almost certain to benefit both performance and code size.
Potential optimizations which are not helpful on average are not
enabled by default, and won't be until the compiler can identify at
compile time the situations where they are known to pay off.

Using non-default optimization options isn't like discovering buried
treasure that nobody else was aware of; those options are disabled by
default for good reason, usually because the tradeoffs they make
aren't helpful in general (yet), or because they have known bugs
which result in faulty executables.

-- 
-Chuck


