Issues with powerd
Nate Lawson
nate at root.org
Wed Mar 9 12:21:37 PST 2005
Kevin Oberman wrote:
> I have finally had a little time to play with powerd and I didn't find
> adaptive mode worked too well for me on my T30 running a Gnome desktop.
>
> The effect of running powerd was to move the freq in a rapid sawtooth,
> dropping quickly to 150 MHz and then jumping back to 1800 and
> repeating. If the CPU was busy, the low point of the sawtooth would be
> elevated to 300 or 600, but the basic pattern was constant oscillation.
>
> I looked at the source and decided that two things were wrong.
>
> 1. Dropping to half of the initial CPU speed was not a good choice. The
> drop in speed was too quick, and it effectively reduced the number of
> usable CPU performance levels to a small number instead of the 15 that
> should be available.
>
> 2. When the CPU got busier, jumping all the way to full speed was
> excessive and would always result in the sawtooth oscillation.
>
> I modified powerd to increase or decrease cpufreq in single steps instead
> of jumping (in either direction) and lowered DEFAULT_ACTIVE_PERCENT to
> 75. This results in a system that wanders between 150 and 450 when idle
> and climbs to full speed when under load.
>
> I suspect this is sub-optimal, but I think it's a lot better than the
> current operation. I will try to spend some time tweaking the algorithm
> a bit to see what makes things run best (for my personal idea of "best").
>
> Ideally, I'd like to see it stay constant on an idle or constantly
> loaded system, and I suspect that I will need to add hysteresis to get
> there.
>
> Attached is my patch to powerd.c.
Thanks, I appreciate you looking into this. I don't have a system with
more than 2 levels, so I'm not able to test complex behavior with my
power strip + ammeter.
There are some characteristics I'd like to see in adaptive mode that
perhaps you'd be interested in considering, implementing, and testing.
The most important problem is that the sample interval is the same as
the reconfigure interval (i.e., no history).
I'd like to see various adaptive schemes in powerd (both predictive and
stochastic). See these papers for more details:
http://akebono.stanford.edu/users/nanni/research/dpm/date00yung.pdf
http://akebono.stanford.edu/users/nanni/research/dpm/VLSI00.pdf
For our default adaptive algorithm, I think sending the CPU to 100% any
time there is work to be done is a good thing since you want to get any
task that is present completed as soon as possible. However, I think
decreasing the CPU speed while idle should decay more slowly (which is
also what you're saying). The important thing to realize is that the
transition latency is very low (10 us to 1 ms max) and getting lower
with newer CPUs. If we're going to be using a full quantum at a HZ of
1000, we might as well do the transition to 100%. And if we're idle for
a couple of quanta, we might as well go to the lowest value.
We currently have powerd sample the CPU busyness every 500 ms. This is
too large an interval when running at a low speed and a task wants to
run (500 quanta spent at the low speed). The problem with our current
polling approach is that each cycle adds some overhead, so you don't
want to be running powerd every microsecond or it will keep the system
cpu-bound. :)
Stepping through all 15 (or more) states when we know we're idle wastes
power since the current model can only transition every half second. My
goal was to get some kind of exponential decay when idle and jump to
100% when not. With a finer-grained sampling interval (1-2 ms?) I think
this might be ok. Would you consider testing this alongside your approach?
One problem with increasing DEFAULT_ACTIVE_PERCENT to 75 is that systems
with decent IO load don't trigger the transition to 100% speed. Doing an
ls -laR > /dev/null on a partition on my system, idle time hovered
around 80-85%, but dropping to a low CPU speed greatly increased the
interrupt latency and decreased throughput. If your benchmark is
"buildworld", the CPU usage of gcc pushes you easily over the 75% mark,
and so you get the desired transition to full speed.
Anyway, let's try to tweak the current approach a little so that it
mostly works. And if you're interested in this, the real solution is to
implement a couple different algorithms and provide them to people to
test. If I had to pick one, it would be exponential averaging (EA).
See the papers above for details.
--
Nate
More information about the freebsd-acpi
mailing list