amd64 slower than i386 on identical AMD 64 system? / How is
hyperthreading handled on amd64?
peter at wemm.org
Thu Mar 16 17:17:32 UTC 2006
On Thursday 16 March 2006 02:46 am, JoaoBR wrote:
> On Wednesday 15 March 2006 18:56, Peter Wemm wrote:
> > I tend to agree with this. ubench is not a useful benchmark for
> > comparing 32 bit vs 64 bit systems.
> > However, what might be interesting is to compile a 32 bit binary
> > (and statically link it) on the i386 system, and compare the
> > runtime on the 64 bit kernel, using the same identical binary.
> > That way you are measuring the same math operations on both
> > platforms. Comparing 64 bit operations vs 32 bit operations is
> > apples vs oranges.
> > Of course, it may still be slower, but at least the results would
> > be more meaningful. Don't assume the OS is slower because the
> > compiler makes the application do twice the work.
> good point
> what do you think of unixbench since it does some real-life tasks?
In general, I don't like synthetic benchmarks at all. What we do at
work is put the machines under real workloads alongside a comparison system,
and measure idle cpu trends over a day or so. A comparison where one
machine has a 30% idle cpu and the other has a 40% idle cpu under the
same *real* workload tells us the most.
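
To put a number on that kind of comparison, here is a minimal sketch. It assumes both machines carry the same workload and that busy time is simply 100 minus idle. The name relative_cpu_cost is made up for illustration and is not from any real tool.

```c
#include <assert.h>

/*
 * Hypothetical helper (an assumption for illustration only): given the
 * idle CPU percentage of two machines under the same workload, return
 * how much CPU machine A burns relative to machine B, i.e.
 * busy_a / busy_b, where busy = 100 - idle.
 * Machine B must not be completely idle, or the ratio is undefined.
 */
double
relative_cpu_cost(double idle_a, double idle_b)
{
	assert(idle_a >= 0.0 && idle_a <= 100.0);
	assert(idle_b >= 0.0 && idle_b < 100.0);

	return ((100.0 - idle_a) / (100.0 - idle_b));
}
```

With the figures above, relative_cpu_cost(30.0, 40.0) is 70/60, or about 1.17, so the 30% idle machine spends roughly 17% more CPU on the same work.
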
Unfortunately, we have some folks here that like to push the machines to
the wall. The problem is that FreeBSD 5 and later tend to not "hit the
wall gracefully" and the results of those are more often a test of how
badly the kernel suffers from lock contention than how it runs under
real load. Still, the max workload numbers are useful because they tell
you what the worst case is.
BTW: don't compare 'make buildworld' of i386 vs amd64, because amd64 not
only builds things differently, but builds all the libraries twice.
amd64 has 5 stages, i386 has 4. Even a 'make TARGET_ARCH=i386' isn't
entirely a fair comparison because one has to build a 64 bit host
compiler in one stage, the other has to build a 32 bit host compiler.
gcc even turns off some optimizations when operating as a cross
compiler. An actual 32 bit buildworld in a 32 bit chroot on both
machines is a fair comparison of buildworld times from an OS
perspective because they are building exactly the same thing. But that
doesn't make it meaningful if you're interested in 'buildworld' times
as a FreeBSD developer who does a buildworld umpteen times per day as
part of compile testing.
Anyway, one has to keep in mind whether a given test is of the operating
system port, or the cpu architecture, or application performance.
ubench in particular is strongly affected by 32 vs 64 bit because it
generates a very different workload for itself depending on the size of
There are a number of weaknesses in the amd64 port too. In particular,
the math library does not yet use the generally superior SSE2
instructions. This is a real setback because the ABI uses SSE2
floating point parameter passing. The effect is that some random libm
function is given a SSE2 register, which we convert to an x87 fp stack
register, do the x87 operation, then convert the x87 stack register
back to a SSE2 register then return the SSE2 result. This is
especially unfortunate when the native SSE2 instruction that would
operate on the SSE2 registers directly is faster. But, I don't know
SSE2 nor x87 fpu assembler code very well, so I've done "just enough"
to get things to work.
It is worth reiterating that I do NOT expect the amd64 port to be better
than i386 across the board. Nor even in most tests. But the
difference should be minimal, except in some specific cases where the
64 bit nature really helps, e.g. if you want to mmap a 3GB file. You
can't do that on a machine running an i386 kernel. I think of the advantages of
using the amd64 port in terms of functionality rather than performance.
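
A minimal sketch of the mmap case, assuming a POSIX system. The name try_map_whole_file is made up for illustration. On amd64 a 3GB file should map without trouble. A 32 bit process would be expected to see mmap() fail, because it cannot find that much contiguous address space.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): try to map an entire file
 * read-only, then unmap it again.  Returns 0 if the mapping succeeded,
 * -1 if the file could not be opened, is empty, is too large to
 * describe with a size_t, or if mmap() itself failed.
 */
int
try_map_whole_file(const char *path)
{
	struct stat sb;
	void *p;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (-1);
	if (fstat(fd, &sb) == -1 || sb.st_size <= 0 ||
	    (uintmax_t)sb.st_size > SIZE_MAX) {
		close(fd);
		return (-1);
	}
	/* The mapping stays valid after close(); we only probe it here. */
	p = mmap(NULL, (size_t)sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);
	if (p == MAP_FAILED)
		return (-1);
	munmap(p, (size_t)sb.st_size);
	return (0);
}
```

Note that a 3GB length still fits in a 32 bit size_t, so the size check alone does not catch it. On i386 the failure would come from mmap() itself.
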
You definitely have to consider functionality if you want a desktop
though. Flash plugins for browsers are right out, for example, unless
you use the Linux browser builds. Most of the time though, no Flash is
usually good because you get less annoying ads. :-)
Peter Wemm - peter at wemm.org; peter at FreeBSD.org; peter at yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5