How to disable acpi thermal?

Alexandre "Sunny" Kovalenko alex.kovalenko at verizon.net
Mon Jan 21 19:37:09 PST 2008


On Mon, 2008-01-21 at 17:39 -0500, Daniel Eischen wrote:
> On Mon, 21 Jan 2008, Daniel Eischen wrote:
> 
> > On Sun, 20 Jan 2008, Alexandre "Sunny" Kovalenko wrote:
> >
> >> 
> >> On Tue, 2008-01-15 at 15:34 -0500, Daniel Eischen wrote:
> >>> [ Redirected from -current ]
> >>> 
> >>> 
> >>> I posted the acpidump here:
> >>>
> >>>    http://people.freebsd.org/~deischen/stl2.iasl
> >>> 
> >>> The problem is that acpi_thermal keeps shutting down the system
> >>> after 2 minutes into a buildkernel.  The system has no load other
> >>> than the buildkernel at the time it shuts down.
> >>> 
> >>> The system is an Intel STL2 Tupelo motherboard with 1 CPU, the
> >>> other CPU socket being occupied by a CPU terminator thingy.
> >>> I uncovered the rackmount system and watched it while building
> >>> a kernel.  With the cover off the acpi monitored temperature
> >>> went to 107C and stayed there.  It only took a minute or two
> >>> to get there.  I felt around inside the chassis and nothing
> >>> was even near being too warm or hot.  With the cover on, the
> >>> temperature goes to 111/112C before being shutdown by acpi_thermal
> >>> (the limit being 110C).  There is no way anything in that
> >>> chassis is anywhere near 100C.  I've disabled acpi_thermal
> >>> for now, but it'd be nice to get a better fix.
> >>> 
> >>> Any ideas?
> >>> 
> >> Firstly, sorry for the delay in answering -- my daytime job decided to
> >> kick in with a vengeance.
> >> 
> >> I took a look at the ASL and it does seem that this thing has an embedded
> >> controller and that is where the _TMP method gets its temperature reading
> >> from (this being conditional on the CPU being present in the socket --
> >> otherwise you get 5 degrees Celsius, hardcoded in the ASL).
> >> 
> >> So the questions are:
> >> 
> >> -- does temperature in TZ2 grow over time as well? (TZ1 should stay at
> >> 5C all the time).
> >
> > No, it stays around the same.  I saw it go to 38 from 35 in
> > the same time that TZ0 went to over 110C.  I didn't see it
> > get any higher than that.
That is what bothers me more than slightly -- the _TMP methods for tz0 and
tz2 (see more on tz1 below) call the same function (EGTV) with a
different first parameter. As far as I can tell (and I did mention
before that I am not an expert in the area), this value is, in turn,
written to one of the EC registers, and then values are read from other
EC registers and given back to the caller as the temperature, AC0 value
and CRT value respectively. Since the call path is identical in both
cases, it is quite possible that the erratic reading is coming from the
actual sensor, as someone in this thread suggested. I was hoping that we
would be able to follow the call trace in the debug ACPI output, but
apparently I do not remember it that well (I was playing these games
about two years ago). I will see if I can cobble together the necessary
combination of level and layer settings here before asking you to do
anything else -- I do apologize for not doing my homework properly.
> 
> One additional note, this is a dual CPU system with only one
> CPU in it, and I am not running an SMP kernel.  I was looking
> at the iasl, and noticed this for TZ0:
> 
>          ThermalZone (TZC0)
>          {
>              Method (_TMP, 0, NotSerialized)
>              {
> -->             If (LNotEqual (And (\_SB.NCPU, 0x01), 0x01))
>                  {
>                      Return (\_SB.PCI0.ISA0.EC0.TC2K (0x05))
>                  }
>                  Else
>                  {
>                      Store (\_SB.PCI0.ISA0.EC0.EGTV (0x21, 0x00), TZT0)
>                      If (LEqual (TZT0, CTC0))
>                      {
>                          Add (TZT0, 0x0A, TZT0)
>                      }
> 
>                      Return (TZT0)
>                  }
>              }
> 
> Is it possible that my configuration with only one CPU
> is confusing things?
> 
As far as I can judge from the similar code in TZC1 and the stable 5C
temperature in the corresponding thermal zone (tz1), this merely checks
for the presence of the chip in the socket and returns a stable (and
bogus) temperature when there is none. If your system is capable of
running with the CPU in socket 1 and a placeholder in socket 0, I would
suspect that your tz0 will be stuck at 5C and your tz1 will demonstrate
some dynamics.
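
To make the branch structure of the quoted TZC0 _TMP method easier to
follow, here is a minimal Python sketch of it. It is only a model, not the
firmware: the function names are made up for illustration, TC2K is assumed
to convert Celsius to the tenths-of-a-kelvin units that ACPI thermal
objects use, bit 0 of NCPU is assumed to mean "CPU present in socket 0",
and the EC reading and CTC0 are passed in as plain numbers because the
quoted ASL does not show where they come from.

```python
# Illustrative model of the quoted TZC0 _TMP method; not the firmware itself.
# All names below are hypothetical stand-ins for the ASL objects.


def c_to_deci_kelvin(celsius):
    """Assumed behaviour of TC2K: Celsius -> tenths of a kelvin.

    Uses the rounded offset 2732 (273.15 K is 2731.5 tenths).
    """
    return celsius * 10 + 2732


def tzc0_tmp(ncpu, egtv_reading, ctc0):
    """Mirror the branch structure of _TMP for thermal zone TZC0.

    ncpu         -- value of the NCPU object (bit 0 assumed: CPU in socket 0)
    egtv_reading -- stand-in for what EGTV (0x21, 0x00) would return
    ctc0         -- stand-in for CTC0 (not defined in the quoted excerpt)
    """
    if (ncpu & 0x01) != 0x01:
        # Socket reported empty: hardcoded 5 C, so this zone never moves.
        return c_to_deci_kelvin(5)
    tzt0 = egtv_reading
    if tzt0 == ctc0:
        # The ASL adds 0x0A to the reading when it equals CTC0.
        tzt0 += 0x0A
    return tzt0


if __name__ == "__main__":
    print(tzc0_tmp(0x00, 3000, 3832))  # empty socket: fixed 5 C value
    print(tzc0_tmp(0x01, 3000, 3832))  # populated: reading passed through
    print(tzc0_tmp(0x01, 3832, 3832))  # reading equal to CTC0: bumped by 0x0A
```

The point of the sketch is that with a CPU reported in socket 0 the method
simply passes the EC reading through, so the presence check alone would
not explain a runaway value in tz0.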

On a slightly different note -- if you don't mind exploring another
potential dead end, I have attached a patch for your ASL which fixes the
situation where the _OFF method of one of the fans grabs a mutex and never
releases it. You can recompile your ASL and override it at boot. No
promises though ;)

-- 
Alexandre "Sunny" Kovalenko
-------------- next part --------------
A non-text attachment was scrubbed...
Name: stl2.iasl.patch
Type: text/x-patch
Size: 688 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-acpi/attachments/20080122/5796bace/stl2.iasl.bin

