HyperThreading makes worse to me (was Re: How to reproduce: Re:
Only 70% of theoretical peak performance on FreeBSD 8/amd64,
yanefbsd at gmail.com
Thu Apr 15 07:52:19 UTC 2010
On Wed, Apr 14, 2010 at 9:21 PM, Ian Smith <smithi at nimnet.asn.au> wrote:
> On Wed, 14 Apr 2010, Garrett Cooper wrote:
> > On Wed, Apr 14, 2010 at 7:49 PM, Garrett Cooper <yanefbsd at gmail.com> wrote:
> > > On Wed, Apr 14, 2010 at 5:46 PM, Maho NAKATA <chat95 at mac.com> wrote:
> > >> Hi Andry and Adam
> > >>
> > >> My test again. No desktop, etc. I just run dgemm.
> > >> Contrary to Adam's result, Hyper Threading makes the performance worse.
> > >> all tests are done on Core i7 920 @ 2.67GHz. (TurboBoost @2.8GHz)
> > >>
> > >> Turbo Boost off, Hyper threading off: 82% (35GFlops) 
> > >> Turbo Boost off, Hyper threading off: 72% (30.5GFlops) 
> Er, shouldn't one of those say HTT on? and/or Turbo boost on? Else
> they're both the same test but with different results?
There's a problem with the cores reported by the kernel on 8.x+. For some odd
reason more recent Intel processors aren't reporting themselves as
HT-enabled when they have HT-cores (see: kern/145385).
I didn't look into the issue too hard, but since it does seem to be a
major performance loss perhaps I should; besides, it would be good
experience to put under my belt :].
> > >> Turbo Boost on, Hyper threading on: 71% (32GFlops) 
> > >> Turbo Boost off, Hyper threading off: 84-89% (38-40GFlops) 
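For reference, the percentages quoted above can be reproduced with simple
arithmetic. This is only a sketch under assumptions that are not stated in
the thread: 4 physical cores, and 4 double-precision flops per cycle per
core (one 2-wide SSE add plus one 2-wide SSE multiply). The function names
are made up for illustration.

```python
# Assumptions (not from the thread): 4 physical cores on the Core i7 920,
# and 4 double-precision flops per cycle per core (2-wide SSE add + multiply).
PHYSICAL_CORES = 4
DP_FLOPS_PER_CYCLE = 4


def peak_gflops(clock_ghz, cores=PHYSICAL_CORES, flops_per_cycle=DP_FLOPS_PER_CYCLE):
    """Theoretical double-precision peak in GFlops for the given clock."""
    return cores * clock_ghz * flops_per_cycle


def percent_of_peak(measured_gflops, clock_ghz):
    """Measured dgemm throughput as a percentage of the theoretical peak."""
    return 100.0 * measured_gflops / peak_gflops(clock_ghz)


if __name__ == "__main__":
    # (label, measured GFlops, clock in GHz) taken from the figures in the thread
    for label, gflops, ghz in [
        ("Turbo Boost off", 35.0, 2.67),
        ("Turbo Boost on", 32.0, 2.8),
    ]:
        print(f"{label}: {percent_of_peak(gflops, ghz):.1f}% of peak")
```

Under these assumptions, 35 GFlops at 2.67GHz works out to roughly 82% of
peak and 32 GFlops at 2.8GHz to roughly 71%, which agrees with the figures
reported above.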
> Clarification of all four possible test configs - 8 if you add pinning
> CPUs or not - might make this a bit clearer?
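The test matrix Ian describes can be enumerated mechanically, so that no
combination gets skipped or reported twice. A minimal sketch; the key names
and the helper functions are hypothetical, not something used in the thread.

```python
from itertools import product


def all_configs():
    """Yield every (Turbo Boost, HTT, pinning) combination as a dict."""
    for turbo, htt, pinned in product((False, True), repeat=3):
        yield {"turbo_boost": turbo, "hyper_threading": htt, "pinned": pinned}


def describe(cfg):
    """Render one configuration in the style used in the results above."""
    on_off = lambda flag: "on" if flag else "off"
    return (
        f"Turbo Boost {on_off(cfg['turbo_boost'])}, "
        f"Hyper threading {on_off(cfg['hyper_threading'])}, "
        f"pinning {on_off(cfg['pinned'])}"
    )


if __name__ == "__main__":
    for cfg in all_configs():
        print(describe(cfg))
```

On FreeBSD the pinned runs could be done with cpuset(1), for example
`cpuset -l 0-3 ./dgemm_test`, where `dgemm_test` is a placeholder name for
the benchmark binary. Which CPU IDs map to physical cores rather than HTT
siblings depends on the machine's topology, so the list `0-3` is only an
illustration.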
> > > Doesn't this make sense? Hyperthreaded cores in Intel procs still
> > > provide an incomplete set of registers as they're logical processors,
> > > so I would expect things to be slower if they're automatically run
> > > on the SMT cores instead of the physical ones.
> Since we're talking FP, do HTT 'cores' share an FPU, or have their own?
> If contended, you'd have to expect worse (at least FP) performance, no?
Ah, that's another excellent point. What instructions is dgemm
using -- pure integer-based arithmetic, floating-point arithmetic,
specialized operations that would benefit from using SIMD, etc?
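On the question of what dgemm computes: it is the BLAS double-precision
general matrix multiply, C := alpha*A*B + beta*C, so the work is almost
entirely floating-point multiplies and adds. The sketch below is a naive
reference version for illustration only; it is not the code of any
particular BLAS library, and tuned libraries typically implement the same
loop nest with SIMD instructions and cache blocking.

```python
def dgemm_reference(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for row-major lists of floats.

    a is m x k, b is k x n, c is m x n. The input c is not modified.
    """
    m, k, n = len(a), len(b), len(b[0])
    # Start from beta * C, then accumulate alpha * (A @ B) into it.
    out = [[beta * c[i][j] for j in range(n)] for i in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]  # one multiply and one add
            out[i][j] += alpha * acc
    return out


def dgemm_flops(m, n, k):
    """Conventional flop count for dgemm: one multiply and one add per term."""
    return 2 * m * n * k


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    c = [[1.0, 1.0], [1.0, 1.0]]
    print(dgemm_reference(2.0, a, b, 1.0, c))
```

A GFlops figure for a run is then `dgemm_flops(m, n, k)` divided by the
elapsed time in seconds and by 1e9.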
> > > Is there a weighting scheme to SCHED_ULE where logical processors
> > > (like the SMT variety) get a lower score than real processors do, and
> > > thus get scheduled for less intensive interrupting tasks, or maybe
> > > just don't get scheduled in high-use scenarios the way a physical
> > > processor would?
> > Err... wait. Didn't see that the turbo boost results didn't scale
> > linearly or align with one another until just a sec ago. Never mind my
> > previous comment.
> Waiting for the fog to lift ..
As am I. I don't know enough in this area, but I'm definitely open