[RFC] Patch to enable temperature ceiling in powerd

Johannes Dieterich dieterich.joh at googlemail.com
Sun Mar 2 18:59:35 UTC 2008


Hi Ian and Alex,

Ian Smith wrote:
> On Sat, 1 Mar 2008, Johannes Dieterich wrote:
>  > Hello everybody!
>  > 
>  > To get back to this discussion (sorry, normal job kicked me quite a bit
>  > last week).
>  > 
>  > Peter Jeremy wrote:
>  > > On Wed, Feb 20, 2008 at 05:06:41PM -0500, Daniel Eischen wrote:
> [..]
>  > > investigating an unrelated problem.  We eventually decided it was a
>  > > faulty sensor and a replacement board fixed it.
> 
>  > What I have now is the original hard drive (some 80 gig Fujitsu one)
>  > with a freshly installed Fedora 8 on it. I have been letting two
>  > instances of gnuchess play against each other for a couple of hours
>  > (yes, I know... best stress test ever... ;-) ) which kept cpu usage at a
>  > nice 100 percent on both cores for all that time.
> 
> I doubt that it amounts to a buildworld, which flogs the disk pretty
> hard too, but that should serve well enough for relative comparison.
I do know that the two can't be compared exactly, although IMHO the
missing ath0 is a bigger difference than the missing I/O. However...


> 
>  > /proc/acpi/thermal_zone/THM1/temperature (and THM0) reported
>  > temperatures around 70 degrees, never over 72 for all that time. Lid was
>  > closed, fan worked (not very noisy even) and blew a good load of hot air
>  > out. I am tempted to say that my overheating problem is not hardware
>  > related. The only differences were that ath0 does not work with Fedora
>  > and that the hard drive is not the 160 gig WD I am using for FreeBSD.
> 
> You could expect your 160GB drive to run a few degrees warmer, but most
> likely still inside the tz1._PSV=80C suggested most recently, however
> you haven't said whether you've yet tried applying the tz1 settings that
> Alexandre last suggested as working well for his very similar model
> Thinkpad on Feb 21st (previous message on this thread to yours)? 
Trying Alexandre's recommendations was the next thing I wanted to do
anyway. I just wanted to blame the hardware once more first... ;-)  So,
after csup'ing to 7.0-STABLE as of today and setting the new system up,
I set

sysctl hw.acpi.thermal.user_override=1
sysctl hw.acpi.thermal.tz1.passive_cooling=1
sysctl hw.acpi.thermal.tz1._PSV=80C

as suggested by Alexandre (thanks again for all your time! :-) ). I see
the following behavior during make buildworld (without any -j flags):
the temperature rises to around 72 degrees almost instantly, then to
something like 83; the fan starts working and cools the machine down,
and the frequency drops to 1000 MHz. Still, I get make buildworld
through almost "out-of-the-box". So far, so good! :-)
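
For anyone who wants to watch the same behavior while a build runs,
something like the following sh sketch might do. It assumes the zone
temperature is readable as hw.acpi.thermal.tz1.temperature (in the
"83.0C" form) and the CPU frequency as dev.cpu.0.freq; the function
names and the 5 second interval are made up for illustration only.

```sh
#!/bin/sh
# Sketch: log tz1 temperature and CPU frequency while a build runs.
# Assumed sysctl names: hw.acpi.thermal.tz1.temperature, dev.cpu.0.freq.

PSV=80              # passive cooling threshold in degrees C (tz1._PSV)

# Turn a reading such as "83.0C" into the integer 83.
tz_to_int() {
    t=${1%C}        # drop the trailing unit
    echo "${t%%.*}" # drop the fractional part
}

# Succeeds (exit status 0) when the reading is at or above the threshold.
at_or_above_psv() {
    [ "$(tz_to_int "$1")" -ge "$PSV" ]
}

# Poll every 5 seconds until interrupted with Ctrl-C.
watch_tz1() {
    while :; do
        temp=$(sysctl -n hw.acpi.thermal.tz1.temperature) || return 1
        freq=$(sysctl -n dev.cpu.0.freq) || return 1
        state=ok
        at_or_above_psv "$temp" && state=passive
        echo "$(date +%T) tz1=$temp freq=${freq}MHz $state"
        sleep 5
    done
}
```

Sourcing the file and running watch_tz1 in a second terminal next to
make buildworld should show when the zone crosses the 80C mark and
whether the frequency drops along with it.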


> 
>  > >>  Only under load does the temperature
>  > >> shoot up, but I know the chip isn't getting hot and the fan
>  > >> is running - I've felt around in there and nothing was even
>  > >> close to the 117+C it was sensing.
> 
> But that was on tz0, wasn't it?  Please read Alex's message carefully;
> if there's still something different about yours we likely need to know. 
The quote above about feeling around inside the machine is not from me.
That would also be rather difficult with my flat notebook. ;-)


> 
>  > > Apart from the actual CPU, most parts of a system have a fairly
>  > > significant thermal mass so a rapid change in temperature either
>  > > indicates a catastrophic failure or the temperature sensor isn't
>  > > really reporting the temperature of the relevant zone.
>  > > 
>  > I totally agree with you, Peter. And either the hardware just fails
>  > under FreeBSD  (or with ath0 and the other hard drive running) OR it is
>  > a FreeBSD problem.
>  > 
>  > Everybody is invited to tell me how to stress test the system as brutally
>  > as possible to show that the problem is hardware related.
> 
> It's possible, but suspecting the hardware may have been a red herring.
> 
> It does seem more clearly related to some of the recent flurry of software
> changes to acpi_thermal.c, that should detect the fact that your cpu
> thermal zone tz1 is the one needing monitoring, rather than tz0.
Would it be worthwhile, or even possible, to find out where exactly the
problem was? It is still not running completely optimally IMHO (though
WAY better than it has for months).


Best regards and thanks again to everybody,

Johannes

