Issues with powerd

Tue Mar 22 15:21:28 PST 2005

Nate,

Sorry about the BL. It's a real pain. I recently found that my home
system is on it, too, because it's in the heavily bot infested Comcast
space. I'll check and see if I can get root.org white-listed.

Just to let you know that I have not fallen off the face of the
earth. Real life has pretty much eaten up all of my time to play with
this. I have continued to refine the existing powerd, but I think it
really needs a bit more context to work ideally.

The first issue is the difference between a system with only a couple of
speed settings and one that enables a large number of settings,
typically 8 or more.

When only a small number of settings are present, they are probably
widely spaced and all fast enough to be useful. When there are a lot of
settings, they are tightly spaced and the slowest may be too slow to
have practical use. At 150 MHz, my system is pretty useless. More
importantly, the tight spacing with only a 10% range in utilization
makes powerd unstable. When the system drops to 150 MHz while idling,
even the slightest demand for CPU will increase the load to 20% and
trigger an increase in speed. The standard powerd code will increase CPU
speed to the maximum, the system will drop to well under 10%
utilization, then the CPU speed will start dropping. With the steps so
closely spaced, it will invariably reach the point where the utilization
on even an "idle" system exceeds 20% and it pops back to maximum.
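
A minimal sketch of the stock decision described above, modelled on the
unpatched lines of the attached diff. The function name and the
percentage-based arguments are illustrative assumptions; the real code
works on raw tick counts inside powerd's main loop.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of powerd.  freqs[] is sorted fastest
 * first; idle_pct is the idle share of the last polling interval.
 */
static int
stock_next_freq(const int *freqs, size_t n, int cur, int idle_pct,
    int running_mark, int idle_mark)
{
	size_t i;

	if (idle_pct < running_mark && cur < freqs[0])
		return (freqs[0]);	/* busy: jump straight to maximum */
	if (idle_pct > idle_mark && cur > freqs[n - 1]) {
		/* idle: first setting at or below half the current speed */
		for (i = 0; i < n - 1; i++)
			if (freqs[i] <= cur / 2)
				break;
		return (freqs[i]);	/* slowest setting if none matched */
	}
	return (cur);			/* between the two marks: hold */
}
```

With marks of 80 and 90, any poll that sees less than 80% idle sends the
CPU to freqs[0] no matter how slow it was running, which is the jump
that produces the sawtooth.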

Your latest patch really only helps a bit. The sawtooth is still there,
but does move more slowly. The down-side is that it never gets very
slow, so the battery dies too quickly. It now oscillates between 1800
and 750 MHz even when nothing is happening on the system.

My last patch tried increasing the speed one "notch" at a time. This
typically resulted in the system ratcheting between 300, 225, and 150
MHz when "idle", but it did not respond as well when it got really busy
as it took several seconds (about 7) to reach full speed. Not really
acceptable.
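
The cost of the one-notch approach is simple arithmetic: the worst-case
ramp takes one poll per notch. A rough sketch follows; both helper names
are made up for illustration, and the table is assumed to be sorted
fastest first.

```c
#include <stddef.h>

/* Index of the setting one notch faster (index 0 is the fastest). */
static size_t
notch_up(size_t idx)
{
	return (idx > 0 ? idx - 1 : 0);
}

/*
 * Worst-case time in milliseconds to climb from the slowest setting to
 * the fastest when each poll moves a single notch.  Assumes nfreqs >= 1.
 */
static unsigned
ramp_ms_one_notch(size_t nfreqs, unsigned poll_ms)
{
	return ((unsigned)(nfreqs - 1) * poll_ms);
}
```

As a purely hypothetical example, 15 settings polled every 500 ms (the
DEFAULT_POLL_INTERVAL in the attached diff) would need 7000 ms, and the
same table polled every 150 ms would need 2100 ms. The actual number of
settings on this machine is not stated here.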

I then tried doubling the speed every time an increase was called
for. This was better, but tended to cause the machine to spend too much
time at higher speeds than was really needed. Nonetheless, this
algorithm combined with a 150 ms polling interval is not too bad. Even
the step at a time works pretty well at the 150 ms polling rate, though,
so I'm not sure which is best. I still have problems setting -r to 80,
though. I see things run much better if I change it to 75.
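
For reference, the doubling step can be pulled out of the loop as a
small function. This is a sketch that mirrors the search in the attached
patch; the function name is an assumption, and freqs[] is again taken to
be sorted fastest first.

```c
#include <stddef.h>

/*
 * Hypothetical helper: scan from the slowest setting upward and return
 * the first one that is at least twice the current speed.  If nothing
 * qualifies the loop stops at index 0, so the result is the maximum.
 */
static int
double_up(const int *freqs, size_t n, int cur)
{
	size_t i;

	for (i = n - 1; i > 0; i--)
		if (freqs[i] >= cur * 2)
			break;
	return (freqs[i]);
}
```

On an illustrative table of 1800, 1500, 1200, 900, 750, 600, 450, 300,
225 and 150 MHz, a start at 150 MHz climbs through 300, 600 and 1200 to
1800 in four polls.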

My next attempt will be to make the speed-up exponential so that the
idle (or near idle) system will not see the CPU speed bumped much. This
forces me to have a history. My first attempt (when I get a few minutes)
will be to bump up the speed by one notch the first time the CPU gets
busy, then by 2 and then by 4 and 8. This will increase the speed to max
in .75 seconds but I hope will keep it at the lowest two speeds when
the system is "idle". Then I will try setting -r back to 80 to help on
I/O issues.
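
One way the history could be kept is sketched below. Nothing here exists
in powerd yet: the struct, its fields and the function name are all
assumptions, and only the speed-up side is shown.

```c
#include <stddef.h>

/* Hypothetical ramp state carried from one poll to the next. */
struct ramp {
	size_t	idx;	/* current index into freqs[], 0 is the fastest */
	size_t	step;	/* notches to climb on the next busy poll */
};

/* Call once per poll; busy is non-zero when idle time is below -r. */
static void
ramp_poll(struct ramp *r, int busy)
{
	if (!busy) {
		r->step = 1;	/* load eased: forget the history */
		return;
	}
	/* Climb by the current step, stopping at the fastest setting. */
	r->idx = (r->idx > r->step) ? r->idx - r->step : 0;
	if (r->step < 8)
		r->step *= 2;	/* 1, 2, 4, then 8 notches */
}
```

A single busy poll on an idle system moves only one notch, while four
busy polls in a row cover up to 15 notches.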

There also needs to be a powercontrol capability. You really want to be
able to set a few things like force a fixed speed and adjust thresholds
on a running daemon.

When I play mp3s when traveling, I want the system to run just fast
enough that the mp3s don't break up. If the system thinks that means 80%
busy (20% idle), that's fine. This is possible by adjusting the -i and
-r values, I think. But the values used would be rather odd looking.

Also when traveling, I often want to REALLY make the battery last, so I
never want the CPU to increase above some maximum speed (probably fairly
low), so I want to set freq_max to about 900 MHz or even less. Let the
CPU rest when it's idle, but never let it use much power even when doing

The papers were interesting, but I can't help but wonder where the point
of diminishing returns will hit. The papers talk about disk drives
where spin up/down takes a VERY long time. The idea of changing CPU
clocking based on the same principles sounds reasonable, but getting code
that is tight enough to do it efficiently certainly looks to be
difficult. Even though the time to switch may be below a quantum, I know
that a significant part of the time a process frees the CPU due to I/O
or other synchronization before the end of a quantum and I suspect this
needs to be taken into account. On the other hand, a good predictive
algorithm would be a possibility, although I am not confident that one is
possible. (Then again, branch prediction seemed impossible not too long
ago.)
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman at es.net			Phone: +1 510 486-8634

-------------- next part --------------

--- powerd.c.orig	Sat Feb 26 17:58:49 2005
+++ powerd.c	Tue Mar 22 14:10:10 2005
@@ -43,7 +43,7 @@
 #include <sys/sysctl.h>
 #include <sys/resource.h>
 
-#define DEFAULT_ACTIVE_PERCENT	80
+#define DEFAULT_ACTIVE_PERCENT	75
 #define DEFAULT_IDLE_PERCENT	90
 #define DEFAULT_POLL_INTERVAL	500
 
@@ -379,24 +379,28 @@
 
 		/*
 		 * If we're idle less than the active mark, jump the CPU to
-		 * its fastest speed if we're not there yet.  If we're idle
+		 * the next faster speed if we're not there yet.  If we're idle
 		 * more than the idle mark, drop down to the first setting
 		 * that is half the current speed (exponential backoff).
 		 */
 		if (idle < (total * cpu_running_mark) / 100 &&
-		    curfreq < freqs[0]) {
+			curfreq < freqs[0]) {
+		    for (i = (numfreqs - 1); i > 0; i--) {
+		      if (freqs[i] >= (curfreq * 2))
+			    break;
+		  }
 			if (vflag) {
 				printf("idle time < %d%%, increasing clock"
 				    " speed from %d MHz to %d MHz\n",
-				    cpu_running_mark, curfreq, freqs[0]);
+				    cpu_running_mark, curfreq, freqs[i]);
 			}
-			if (set_freq(freqs[0]))
+			if (set_freq(freqs[i]))
 				err(1, "error setting CPU frequency %d",
-				    freqs[0]);
+				    freqs[i]);
 		} else if (idle > (total * cpu_idle_mark) / 100 &&
 		    curfreq > freqs[numfreqs - 1]) {
 			for (i = 0; i < numfreqs - 1; i++) {
-				if (freqs[i] <= curfreq / 2)
+				if (freqs[i] < curfreq)
 					break;
 			}
 			if (vflag) {