amd64 slower than i386 on identical AMD 64 system? / How is hyperthreading handled on amd64?

Peter Wemm peter at wemm.org
Wed Mar 15 21:57:48 UTC 2006


On Tuesday 14 March 2006 03:20 pm, Coleman Kane wrote:
> On 3/14/06, JoaoBR <joao at matik.com.br> wrote:
> > On Tuesday 14 March 2006 07:06, Alexander Konovalenko wrote:
> > > > Hi
> > > > For some time (>6.0R) I have had the impression that amd64 runs
> > > > slower than i386. I have now run some tests on identical
> > > > hardware, and ubench confirms this. Does anybody have comments
> > > > on this?
> > >
> > > I have a dual-core AMD64 4400+ and FreeBSD RELENG_5. I don't have
> > > FreeBSD i386 installed, but you can just compare benchmarks.
> > >
> > > ubench uses all CPUs/cores by default. When one ubench is running,
> > > top shows:
> >
> > So where is your comparison? My point was that the same hardware is
> > faster running i386.
> >
> > I see this on X2 machines too, but I do not have two identical
> > machines to compare. I have an X2-4400-SMP running amd64 and an
> > X2-4200-SMP running i386, and ubench gives me the same numbers on
> > both.
> >
> >
> >
> > João
> >
> > >   PID USERNAME   PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
> > > 11528 XXXX       111    0  3572K   880K RUN    1   0:12 93.64% 42.29% ubench
> > > 11529 XXXX       111    0  3572K   880K CPU0   1   0:11 97.21% 41.16% ubench
> > > 11526 XXXX        -8    0  3572K   880K piperd 0   0:17 41.76% 31.98% ubench
> > >
> > >
> > > one ubench executed (with no -s flag = use all CPU, default):
> > >
> > > Unix Benchmark Utility v.0.3
> > > Copyright (C) July, 1999 PhysTech, Inc.
> > > Author: Sergei Viznyuk <sv at phystech.com>
> > > http://www.phystech.com/download/ubench.html
> > > FreeBSD 5.5-PRERELEASE FreeBSD 5.5-PRERELEASE #12: Sun Mar  5
> > > 17:34:07 CET 2006     XXXX at XXXX:/usr/obj/usr/src/sys/DAEMON64SMP
> > > amd64
> > > Ubench CPU:   238149
> > > Ubench MEM:   255459
> > > --------------------
> > > Ubench AVG:   246804
> > >
> > >
> > > two ubench executed with -s flag (use single CPU only):
> > >
> > > Ubench Single CPU:   120184 (0.40s)
> > > Ubench Single MEM:   126787 (0.39s)
> > > -----------------------------------
> > > Ubench Single AVG:   123485
> > >
> > > Ubench Single CPU:   121000 (0.41s)
> > > Ubench Single MEM:   128762 (0.40s)
> > > -----------------------------------
> > > Ubench Single AVG:   124881
> > >
> > >
> > > one ubench executed with -s flag (use single CPU only):
> > >
> > > Ubench Single CPU:   123251 (0.40s)
> > > Ubench Single MEM:   161494 (0.40s)
> > > -----------------------------------
> > > Ubench Single AVG:   142372
> > >
> > >
> > > /Alexander Konovalenko
> > >
> > > +46-8-5537-8142 (office)
> > > +46-7-3752-2116
> > > http://daemon.nanophys.kth.se/~kono
> > >
> > > Royal Institute of Technology (KTH)
> > > Nanostructure Physics Department, Albanova
> > > Roslagstullsbacken 21
> > > 10691 Stockholm
> > > Sweden
> >
>
> I think that the nature of the ubench benchmark should be
> investigated to reveal the reasons behind your dismay. It seems to me
> that your assumption that 64-bit should be faster than 32-bit in all
> cases is wrong. The nature of the processor design, the OS
> implementation, and how ubench does its measurement all need to be
> addressed.
>
> First of all, when comparing a 64-bit amd64 system to a 32-bit IA-32
> system, it is important to know that "64-bit" *does not* mean the
> 64-bit machine would get through twice as many iterations of a loop
> like this one:
> long x, y, z;
> x = 1;
> y = 1;
> z = x + y;
>
> In fact, on the 64-bit machine the memory taken up by x, y, and z
> would be double that on i386 (a long is 8 bytes instead of 4), the
> add/load instructions would also be larger (each needs an extra
> prefix byte for a 64-bit operand), and the execution time *should* be
> about the same on both. All of this suggests that 64-bit would, by
> its nature, have a slower average than your 32-bit system.
>
> In addition, amd64 64-bit mode doubles your register set, increasing
> the amount of state that has to be saved and restored on a context
> switch, so everything points towards... probably slower.

I tend to agree with this.  ubench is not a useful benchmark for 
comparing 32 bit vs 64 bit systems.

However, what might be interesting is to compile a 32 bit binary (and 
statically link it) on the i386 system, then run that identical binary 
under both the i386 kernel and the 64 bit kernel and compare the 
runtimes.  That way you are measuring the same math operations on both 
platforms.  Comparing 64 bit operations vs 32 bit operations is apples 
vs oranges.
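
As a rough sketch of what such a test program could look like (this is 
not ubench's code; the names churn32 and cpu_seconds are made up for 
illustration), pinning the arithmetic to fixed-width types means the 
binary does the same 32 bit operations wherever it runs.  Built once on 
the i386 box with something like "cc -static", the same file could then 
be timed under both kernels:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical workload: a 32-bit linear congruential step repeated
 * `iterations` times. uint32_t keeps the operand width identical on
 * every platform; the arithmetic wraps modulo 2^32. */
static uint32_t churn32(uint32_t iterations)
{
    uint32_t acc = 1;
    for (uint32_t i = 0; i < iterations; i++)
        acc = acc * 1664525u + 1013904223u;
    return acc;
}

/* CPU time consumed by this process so far, in seconds. */
static double cpu_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

#ifdef DEMO_MAIN
int main(void)
{
    assert(churn32(0) == 1u);   /* zero iterations leaves the seed untouched */

    double start = cpu_seconds();
    uint32_t checksum = churn32(200000000u);
    double elapsed = cpu_seconds() - start;

    /* Print the checksum so the compiler cannot discard the loop. */
    printf("checksum %lu, %.2f s CPU\n", (unsigned long)checksum, elapsed);
    return 0;
}
#endif
```

Compiling with -DDEMO_MAIN gives a standalone program.  The iteration 
count is arbitrary; pick one large enough that the run takes a few 
seconds on the machine being tested.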

Of course, it may still be slower, but at least the results would be 
more meaningful.  Don't assume the OS is slower because the compiler 
makes the application do twice the work.
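
To make the "twice the work" point concrete, here is a small sketch 
(the helper names are hypothetical) built around the x, y, z snippet 
quoted above.  On FreeBSD/i386 a long is 4 bytes and on FreeBSD/amd64 it 
is 8, so the same source line asks for different operations on the two 
platforms, while the int32_t version stays the same width everywhere:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* The quoted snippet using the platform's native long. */
static long add_native(void)
{
    long x = 1, y = 1;
    return x + y;
}

/* The same arithmetic pinned to 32 bits on every platform. */
static int32_t add_fixed32(void)
{
    int32_t x = 1, y = 1;
    return x + y;
}

/* Bytes occupied by the three variables x, y, z in each variant. */
static size_t native_bytes(void)  { return 3 * sizeof(long); }
static size_t fixed32_bytes(void) { return 3 * sizeof(int32_t); }

#ifdef DEMO_MAIN
int main(void)
{
    assert(add_native() == add_fixed32());   /* same answer either way */

    printf("sizeof(long) = %zu, sizeof(void *) = %zu\n",
           sizeof(long), sizeof(void *));
    printf("x, y, z as long: %zu bytes; as int32_t: %zu bytes\n",
           native_bytes(), fixed32_bytes());
    return 0;
}
#endif
```

Compiling with -DDEMO_MAIN and running the result on each system shows 
which widths the compiler actually chose there.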

-- 
Peter Wemm - peter at wemm.org; peter at FreeBSD.org; peter at yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5

