Let's use gcc-4.2, not 4.1 -- OpenMP
Stefan Ehmann
shoesoft at gmx.net
Sat Dec 16 05:54:12 PST 2006
On Friday 15 December 2006 21:51, David O'Brien wrote:
> On Fri, Dec 15, 2006 at 07:14:53PM +0100, Stefan Ehmann wrote:
> > > CPU: AMD Athlon(TM) XP 2700+ (2166.44-MHz 686-class CPU)
>
> ..
>
> > Settings/Compiler           | gcc-3.4 | gcc-4.1 | gcc-4.2
> > ----------------------------+---------+---------+---------
> > -O2                         | 6.46s   | 6.67s   | 6.38s
> > -O2 -funroll-loops          | 4.44s   | 4.16s   | 4.02s
> > -O2 -march=athlon-xp -fun.. | 4.39s   | 4.38s   | 4.26s
> > -O3                         | 6.14s   | 5.23s   | 5.16s
> > -O3 -funroll-loops          | 4.24s   | 4.87s   | 4.95s
> > -O3 -march=athlon-xp -fun.. | 4.19s   | 4.90s   | 5.07s
>
> A fine example that -O3 isn't always better than -O2.
> I wonder if you're blowing the L2 cache. IIRC, all Athlon XP 2700+
> are the Thoroughbred core, which has only 256KB L2.
Yes, only 256KB L2 cache here.
Results on a Pentium M with 2MB L2 cache were quite similar: with loop
unrolling, -O2 was still faster than -O3, though the gap was smaller than on
the Athlon XP.
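To put the gap in numbers, here is a rough sketch that computes the relative
change in run time when going from -O2 to -O3 with -funroll-loops, using only
the Athlon XP timings from the quoted table. The names (timings,
o3_change_percent, changes) are made up for illustration, and the Pentium M
numbers are not included because they were not posted.

```python
# Timings in seconds, copied from the quoted table (Athlon XP 2700+).
timings = {
    "gcc-3.4": {"-O2 -funroll-loops": 4.44, "-O3 -funroll-loops": 4.24},
    "gcc-4.1": {"-O2 -funroll-loops": 4.16, "-O3 -funroll-loops": 4.87},
    "gcc-4.2": {"-O2 -funroll-loops": 4.02, "-O3 -funroll-loops": 4.95},
}


def o3_change_percent(row):
    """Percent change in run time going from -O2 to -O3 (positive = slower)."""
    o2 = row["-O2 -funroll-loops"]
    o3 = row["-O3 -funroll-loops"]
    return (o3 - o2) / o2 * 100.0


# Hypothetical helper output: one percentage per compiler.
changes = {cc: o3_change_percent(row) for cc, row in timings.items()}

for cc, pct in changes.items():
    print(f"{cc}: {pct:+.1f}%")
```

A positive value means -O3 was slower than -O2 for that compiler; a negative
value means it was faster.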
As a side note: the (stripped) gcc42 binaries were up to twice the size of the
gcc34 binaries.
More information about the freebsd-current mailing list