Release Building and /etc/make.conf
bde at zeta.org.au
Fri Jan 23 04:34:47 PST 2004
On Wed, 21 Jan 2004, John Baldwin wrote:
> On Tuesday 20 January 2004 11:58 pm, Bruce Evans wrote:
> > i386 (or equivalently, no special tuning) is the best default, at least
> > in non-FPU-intensive applications. In my integer crunching application/
> > benchmark (searching a game tree), it even gives better results than
> > -mcpu=pentiumpro on a pentiumpro class machine (a 366MHz Celeron).
> > -mcpu=athlon-xp gives even better results.
> > All with -O3 -fomit-frame-pointer
> > -mcpu=athlon-xp    48.42 real   47.31 user   0.41 sys
> >                    51.22 real   50.10 user   0.30 sys
> > -mcpu=i386         51.98 real   50.18 user   0.34 sys
> > -mcpu=pentiumpro   56.38 real   55.26 user   0.34 sys
> > -mcpu=pentium2     56.24 real   55.25 user   0.36 sys
> > -mcpu=pentium3     56.59 real   55.25 user   0.40 sys
> > -mcpu=pentium4     58.52 real   56.96 user   0.36 sys
> > -mcpu=i486         79.17 real   77.69 user   0.32 sys
> > -mcpu=i586         74.80 real   73.07 user   0.48 sys
> > This is just one benchmark, chosen for its potential optimizability.
> > I only did non-exhaustive benchmarks for the makeworld benchmark. I
> > removed the -mpentiumpro change when I saw the kernel size bloat that
> > it gave.
> Does -mcpu=athlon-xp perform worse than the default in other benchmarks that
> you've run?
I haven't run enough to be sure. It's hard to test all the combinations for
long enough. Some quick tests with the cc1 application/benchmark:
cc1 compiled with -O3 -fomit-frame-pointer, and:
-mcpu=i386 (code o3)
-mcpu=i486 (code o4)
-mcpu=athlon-xp (code oa)
-mcpu=pentiumpro (code op)
Times for the "all" part of "make obj; make depend; make all" starting
with an empty object tree and source tree = src/bin on the Celeron and
src/usr.sbin on the Athlon (it doesn't complete because it wants to
link to never-installed unbuilt libraries, but it gets a fair way).
Smallest real time for 2 runs:
On a Celeron 400 with source tree src/bin:
o3: 121.94 real 97.14 user 19.94 sys
o4: 130.83 real 106.59 user 19.07 sys
oa: 122.69 real 97.58 user 19.39 sys
op: 124.01 real 99.54 user 19.56 sys
All non-null -mcpu settings are pessimizations, with -mcpu=i486
significantly bad and -mcpu=pentiumpro probably significantly bad.
Optimizing the pentiumpro class machine as an athlon-xp works
better (less worse here) than optimizing it as a pentiumpro in this
benchmark too, but the differences are smaller.
On an Athlon-XP1600 overclocked with source tree src/usr.sbin:
o3: 67.62 real 57.46 user 9.53 sys
o4: 69.09 real 57.65 user 10.20 sys
oa: 67.53 real 56.78 user 9.62 sys
op: 68.14 real 57.47 user 9.70 sys
Most of the differences are too small to be significant. Optimizing
the athlon-xp as an athlon-xp at least doesn't pessimize it.
My integer-crunching benchmark shows similarly small differences on
freefall, but that may be just because freefall's gcc is so old.
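To put numbers on "too small to be significant", the real times in the two
cc1 tables above can be expressed as a percent change from the o3
(-mcpu=i386) run. The following is only a rough sketch: the Python names
(celeron, athlon, percent_vs_baseline) are made up for illustration, the
times are copied from the tables, and with just 2 runs per setting the
results say nothing about run-to-run noise.

```python
# Real times in seconds, copied from the two cc1 tables above.
# The dictionary and function names are illustrative, not from the post.
celeron = {"o3": 121.94, "o4": 130.83, "oa": 122.69, "op": 124.01}
athlon = {"o3": 67.62, "o4": 69.09, "oa": 67.53, "op": 68.14}


def percent_vs_baseline(times, baseline="o3"):
    """Return each code's real time as a percent change from the baseline run."""
    base = times[baseline]
    return {code: round(100.0 * (t - base) / base, 2) for code, t in times.items()}


for name, times in (("Celeron", celeron), ("Athlon-XP", athlon)):
    for code, pct in percent_vs_baseline(times).items():
        # Positive means slower than o3, negative means faster.
        print(f"{name} {code}: {pct:+.2f}%")
```

On the Celeron this puts o4 roughly 7% slower than o3 and the other two
settings within 2%; on the Athlon-XP every setting is within about 2% of
o3, with oa marginally faster.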
> > > > Note that CPUTYPE has worse bugs for i386's. Setting it to a supported
> > > > CPU gives -march instead of -mcpu, so using it gives unportable
> > > > binaries, and bsd.cpu.mk provides no way to get the corresponding -mcpu
> > > > settings. OTOH, CPUTYPE for alphas gives only -mcpu.
> > >
> > > That is by design. Note that on all non-i386 architectures such as
> > > alpha, etc. -mcpu means the same thing as -march. The other
> > > architectures use -mtune to get the same effect as -mcpu on i386.
> > Doesn't make it any less of a bug.
> The intent of CPUTYPE is that you can have ports and world optimized for the
> specific machine you are compiling on, it is not set to anything by default,
> so the user only gets -march=foo if they explicitly ask for it. I fail to
> see how that is a bug.
It is a bug because it implements the least useful option set first.