HEADSUP: cpufreq import complete, acpi_throttling changed
Nate Lawson
nate at root.org
Tue Feb 15 00:13:14 PST 2005
Kevin Oberman wrote:
>>Date: Mon, 14 Feb 2005 22:19:48 +0100
>>From: Pawel Worach <pawel.worach at telia.com>
>>
>>Hi, sorry for the delay. I bumped the number of retries to 2000 and I
>>can still repro. the error if the cpu has some load, I believe that is
>>expected. Even when "idle" (gnome desktop running) it works fine with
>>100, I think the first time I tested it I had mplayer running. I can't
>>see a real-life reason for bumping the number of retries, from all
>>speeds above 200 MHz I can step back up to 1.7 GHz without problems
>>under light cpu load. The power_profile script should probably have a
>>min limit, 75 MHz is ridiculous :)
Ok, no problem.
>>Another cool thing would be if the speed could be stepped
>>automagically based on current battery level, that would likely be the
>>job for a powerd(8).
>
>
> I've been thinking about this, too,
Strangely, I've thought about this for 2 years. ;-)
> and I think that stepping things down
> with battery level is not the answer. I think it MIGHT make sense to do
> so based on battery discharge rate. This would allow a user to configure
> an approximate battery "lifetime". It is especially important as
> batteries wear and, if two batteries are present, one discharges faster
> than the other.
I don't think battery level or discharge rate is a useful control input.
Think about if you have your laptop sitting on your desk for an hour,
and then want to buildworld for an hour. You certainly want the system
to power everything possible down for the first hour and run as fast as
possible the second hour. The factor almost everyone gets wrong is the
integral:
Nate Efficiency = Useful Work Done / Amp-hours burned
Total amp-hours = Sum[t: 0...Tdead](PowerUsed(t))
This formula explains why you should run the CPU at full speed whenever
there is work to be done (increasing the numerator) and run it as low as
possible when idle (decreasing the loss of the denominator). The
denominator is ultimately fixed (you only have so much battery).
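To put rough numbers on that argument, here is a minimal sketch of the
desk-then-buildworld case. Every figure in it (the wattages, the work
rates, the policy names) is a made-up assumption for illustration, not a
measurement, and it uses watt-hours as a stand-in for amp-hours.

```python
# Hypothetical power draw in watts for each (state, cpu speed) pair.
# These numbers are assumptions for illustration only.
POWER_W = {
    ("idle", "low"): 8.0,
    ("idle", "high"): 15.0,
    ("busy", "low"): 20.0,
    ("busy", "high"): 30.0,
}

# Hypothetical useful work completed per busy hour at each speed.
WORK_RATE = {"low": 0.5, "high": 1.0}


def efficiency(schedule):
    """Return useful work per watt-hour for a list of (state, speed, hours)."""
    work = sum(WORK_RATE[speed] * hours
               for state, speed, hours in schedule if state == "busy")
    energy = sum(POWER_W[(state, speed)] * hours
                 for state, speed, hours in schedule)
    return work / energy


# One idle hour on the desk, then one hour of buildworld.
always_high = [("idle", "high", 1.0), ("busy", "high", 1.0)]
always_low = [("idle", "low", 1.0), ("busy", "low", 1.0)]
adaptive = [("idle", "low", 1.0), ("busy", "high", 1.0)]

for name, sched in [("always_high", always_high),
                    ("always_low", always_low),
                    ("adaptive", adaptive)]:
    print(name, round(efficiency(sched), 4))
```

With these assumed numbers the adaptive schedule (low when idle, full
speed under load) gets the most work out of each watt-hour, and slowing
down while there is work to do gets the least.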
On AC power, the denominator is infinite (meaning Nate is never very
efficient). In this case, thermal (and hence noise) issues become the
main concern. You also want to keep your power bill low, so conserving AC power
is a minor but valid concern (think: server farm.)
I think the control inputs should be current system load and thermal
level. The system load control function would take the current
instantaneous load, all previous measured loads, and current CPU freq
level and output its desired frequency. The thermal function would take
into account temperatures of each zone, current active coolers, desired
temperature, current CPU freq, and output its desired frequency. There
would be a weighting value that would allow the user to select which
factor dominates the decision. All this is a good research project.
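One way the two control functions and the weighting could fit together
is sketched below. This is only an illustration under stated
assumptions: the frequency table, the 0.9 saturation cutoff, the
headroom factor, and all function names are hypothetical, and active
coolers are left out to keep it short.

```python
# Hypothetical frequency levels in MHz, ascending. A real system would
# get these from whatever cpufreq drivers are attached.
LEVELS_MHZ = [600, 800, 1000, 1200, 1400, 1700]


def snap_down(freq, levels=LEVELS_MHZ):
    """Highest supported level not above freq, or the lowest level."""
    fits = [f for f in levels if f <= freq]
    return max(fits) if fits else min(levels)


def load_wants(loads, cur_freq, levels=LEVELS_MHZ, headroom=0.75):
    """Frequency the load function asks for.

    loads: utilisation samples in 0..1 measured at cur_freq, newest last.
    """
    busiest = max(loads[-1], sum(loads) / len(loads))
    if busiest >= 0.9:
        # Saturated: the real demand is unknown, so go flat out.
        return max(levels)
    needed = busiest * cur_freq / headroom
    enough = [f for f in levels if f >= needed]
    return min(enough) if enough else max(levels)


def thermal_wants(zone_temps_c, desired_c, cur_freq, levels=LEVELS_MHZ):
    """Frequency the thermal function asks for: step down one level
    while the hottest zone is above the desired temperature."""
    if max(zone_temps_c) <= desired_c:
        return max(levels)
    lower = [f for f in levels if f < cur_freq]
    return max(lower) if lower else min(levels)


def decide(load_freq, thermal_freq, thermal_weight, levels=LEVELS_MHZ):
    """Blend the two requests. thermal_weight is the user's knob:
    1.0 lets thermal dominate, 0.0 lets load dominate."""
    blended = thermal_weight * thermal_freq + (1.0 - thermal_weight) * load_freq
    return snap_down(blended, levels)


if __name__ == "__main__":
    want_load = load_wants([0.2, 0.95], cur_freq=800)
    want_thermal = thermal_wants([55.0, 82.0], desired_c=75.0, cur_freq=1700)
    print(want_load, want_thermal, decide(want_load, want_thermal, 0.5))
```

Snapping the blended value down rather than up is a deliberate choice
here, so that a tie goes to the cooler setting.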
Don't forget to throw in sleep states (i.e. S1 or S3) and disk spindown
if you want to be complete. The Linux "laptop mode" patches are a good
example of how to do some of this right, as is iBook behavior.
> The other issue is thermal. I would assume that the frequency should be
> decreased when _PSV is reached, but should it continue to drop the
> frequency until the temperature stabilizes or until it drops to _PSV? I
> believe the latter is a better choice, especially as the effect is not
> quite instantaneous, and since it is only read by ACPI at fairly long
> intervals. This means that the adjustment should not be too aggressive
> to prevent continual oscillation of the frequency and temperature.
>
> And when do you start increasing the frequency again when temperature
> drops? Once again, you want to reach thermal stability and not
> oscillate around _PSV (or at least do so slowly). As there is probably
> substantial variation between systems, a settable hysteresis is
> probably needed for really good results. (This gets worse for systems
> which don't support both throttling and frequencies.)
_PSV should be implemented in acpi_thermal. It would control the
frequency through CPUFREQ_GET/SET. That's one main reason why I added
both user and kernel interfaces. acpi_thermal doesn't have to know what
cpufreq devices are on the system, it just can use whatever is there.
If you read the section of the ACPI spec on _PSV, you'll see it offers
an equation and methods for the BIOS to signal the appropriate
coefficients for getting good hysteresis.
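As best I recall it, the spec's passive cooling equation is
dP[%] = _TC1 * (Tn - Tn-1) + _TC2 * (Tn - Tt), evaluated once per
sampling period, where Tt is the _PSV trip point and the BIOS supplies
_TC1 and _TC2. The sketch below applies it to a performance percentage.
The function name, the use of degrees C, and the sample coefficients are
assumptions for illustration; mapping the percentage onto real cpufreq
levels is left out.

```python
def passive_step(perf_pct, temp_now, temp_prev, psv, tc1, tc2):
    """Apply one sampling period of the passive cooling equation.

    perf_pct:  current performance as a percentage (0..100)
    temp_now:  temperature this sample (Tn)
    temp_prev: temperature last sample (Tn-1)
    psv:       passive trip point (Tt)
    tc1, tc2:  coefficients the BIOS would report via _TC1 and _TC2
    """
    delta_p = tc1 * (temp_now - temp_prev) + tc2 * (temp_now - psv)
    # A positive delta_p means "reduce performance"; clamp to 0..100.
    return min(100.0, max(0.0, perf_pct - delta_p))


if __name__ == "__main__":
    perf = 100.0
    prev = 80.0
    # Hypothetical temperature readings around a trip point of 80.
    for temp in [82.0, 81.0, 79.0]:
        perf = passive_step(perf, temp, prev, psv=80.0, tc1=2, tc2=3)
        print(temp, perf)
        prev = temp
```

The first term reacts to how fast the temperature is moving and the
second to how far it is from the trip point, which is what damps the
oscillation around _PSV.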
> And, should TCC be folded into the equation for P4 systems? After all,
> that's what it's for. I don't see any way to set TCC to automatic at the
> moment, but that could be a significant tool in thermal stability.
> (There may be a way, but I didn't see it in the sources.)
I'm soon going to move p4tcc to be another relative cpufreq driver. It
will be under manual control although the driver is free to implement
some hidden ultimate limit via automatic control to keep the chip from
melting. I think TCC already has that non-configurable feature in hw no
matter what we do.
Whatever the case, I think optional cpufreq management (i.e. powerd)
should be done in usermode. This allows it to make complex decisions
and link with lots of components (want to coordinate with a cluster over
the network? sure!) If it crashes, the system just uses more power or
is slow until a user restarts it. However, thermal or other emergency
uses of cpufreq should be in the kernel and use the higher priorities so
that the system doesn't melt down when a fan dies.
--
Nate