numbers don't lie ...

Wed Sep 13 14:12:41 PDT 2006

--- Mike Meyer <mwm at mired.org> wrote:
> > i.e. since the hyperthreading virtual CPUs are not actually real CPUs,
> > they spend a lot of time blocked in the same CPU core waiting for
> > another hyperthread to release a resource, so the threads are both
> > "running" from the point of view of the OS, but one is doing no work
> > on the CPU a lot of the time.
> 
> In other words, hyperthreading makes the measurement FreeBSD takes to
> see how much cpu time is being used nearly meaningless. I hadn't
> realized that.
> 
> My understanding was that hyperthreading was intended to let the
> system make more efficient use of the CPU, by providing two
> instruction streams to be scheduled in the pipeline. This means you
> get fewer bubbles in the pipeline, resulting in more work getting done
> in the same number of cycles. The hyperthreads don't lock resources
> per se, but there are lots of screwy rules about when things can be
> put in the pipeline, leading to the same result - a hyperthread will
> "wait" some number of steps in the pipeline for the rules to allow the
> hyperthread's next step to happen. Later implementations of
> hyperthreading relax the rules, meaning you get less waiting, or more
> efficient use of the cpu, depending on how you want to look at it.

Yes, but the net result is that when instructions from the
two hyperthreads are scheduled on the CPU and both require
the same execution unit or other CPU resource, they will
serialize on the chip, and from the point of view of
the OS one instruction will take twice as long to
execute as the other.

There's nothing the OS can really do to avoid charging
the process for the extra time since the OS doesn't
know that the CPU blocked one of the instructions it
was told to execute.
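
To put rough numbers on that, here is a toy model. It is my own
simplification for illustration, not how the scheduler or the chip
actually accounts for anything: two hyperthreads each have the same
amount of work, and `contention` is an assumed fraction of cycles in
which both want the same execution unit, so one of them has to wait.
The function name and fields are made up for this sketch.

```python
def charged_time(work_cycles, contention):
    """Toy accounting model for two hyperthreads sharing one core.

    work_cycles: useful cycles each thread needs (hypothetical unit).
    contention:  assumed fraction (0..1) of those cycles where both
                 threads need the same resource and must serialize.
    """
    if not 0.0 <= contention <= 1.0:
        raise ValueError("contention must be between 0 and 1")

    # Each thread waits on its sibling for this many cycles.
    stall = work_cycles * contention

    # Both threads look "running" to the OS until they finish.
    wall = work_cycles + stall

    return {
        "wall": wall,                  # elapsed cycles
        "charged_total": 2 * wall,     # what the OS bills both threads
        "useful_total": 2 * work_cycles,  # work the core really did
    }


if __name__ == "__main__":
    for c in (0.0, 0.5, 1.0):
        print(c, charged_time(100, c))
```

With full contention the model charges 400 cycles for 200 cycles of
useful work, and the elapsed time is the same 200 cycles a single CPU
would need to run the two jobs back to back. With no contention the
charge and the useful work agree. Real workloads fall somewhere in
between.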

Anyway, there are other conditions where the CPU will
stall a "running" instruction leading to "extra" time
charged to the process when the CPU was doing no work
(e.g. cache misses), so this is just one more thing to
understand about CPU performance and what affects it. 
If you really want to tune your application precisely
then you need to delve into the statistics counters
provided by the CPU (see e.g. pmcstat(8)).

Note that you can still use the existing time
accounting behaviour to tune your application for
performance (or evaluate how non-optimal it is) on an
HTT CPU, e.g. by comparing numbers with/without HTT.
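
One way to collect such numbers is sketched below. This is only a
rough harness under my own assumptions: the workload is a stand-in
arithmetic loop rather than your application, `run_workers` is a
made-up helper name, and the per-child CPU time includes interpreter
startup. It runs one and then two identical CPU-bound children and
reports wall time alongside the CPU time each child was charged.

```python
import subprocess
import sys
import time

# Child workload: a fixed amount of CPU-bound work. The child prints
# the CPU time it was charged (user + system) when it finishes.
CHILD = """
import time
total = 0
for i in range({n}):
    total += i * i
print(time.process_time())
"""


def run_workers(count, iterations=200_000):
    """Run `count` identical children at once.

    Returns (wall_seconds, [cpu_seconds_per_child]).
    """
    code = CHILD.format(n=iterations)
    start = time.perf_counter()
    # Start every child before waiting on any, so they overlap.
    procs = [
        subprocess.Popen([sys.executable, "-c", code],
                         stdout=subprocess.PIPE, text=True)
        for _ in range(count)
    ]
    cpu = [float(p.communicate()[0]) for p in procs]
    wall = time.perf_counter() - start
    return wall, cpu


if __name__ == "__main__":
    for count in (1, 2):
        wall, cpu = run_workers(count)
        print(f"{count} worker(s): wall={wall:.3f}s cpu={cpu}")
```

Run it once with HTT enabled and once with it disabled. If the CPU
time charged per child for the same work goes up when two children
run together with HTT on, the difference is a rough indication of
time spent stalled on the sibling thread rather than doing work.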

> > This means that hyperthreading may or may not increase your
> > performance depending on your workload (in your case it does).
> 
> Which is why I checked. This behavior isn't really different from any
> other multi-CPU system: enabling another processor may or may not
> increase your performance depending on your workload. In particular,
> if some shared resource is the critical one for your workload and you
> don't get more of it by turning on the second CPU, turning on the
> second CPU will not improve performance, and will probably hurt it by
> adding overhead. Hyperthreading is the worst case I know of, because
> those CPUs share the CPU core.

Yep, not different in quality, but in degree; if you
compare a dual core system to a hyperthreading system,
all other things being equal a dual core system will
win hands down.

The key thing to remember is that while hyperthreading
presents itself to the OS as two independent
virtual CPUs, there are worst case scenarios where
it's no better than a single CPU, and in practice it is
worse than a single CPU because of the extra overhead
required in an SMP kernel.

Kris
