FreeBSD shutting down unexpectedly

Yuri Lukin freebsd at swaggi.com
Thu Mar 23 16:42:26 UTC 2006


soralx at cydem.org wrote ..
> 
> However, if you measure the temperature with the case open and CPU idle,
> and cooler's performance is same or better than assumed, you'd better
> not rely on this processor. In fact, 55*C is somewhat too high in any
> case, considering that there exists additional heat dissipation path
> through mainboard.

I checked mbmon as well as the BIOS, and both reported a drop of at least 10 degrees
with the case open and the CPU running idle.
 
> I'd check the thermal interface btw CPU and cooler first. Is the heatsink
> sitting level on the core? Is there a nice thin layer of clean thermal
> compound between them? Fan turning at good RPM? Then I'd check Vdd with
> a scope (or at least a DMM). Is it at the right level and clean? At this
> point I would think twice before replacing the CPU. Overheating it could
> have created some kind of permanent latchups (shorts from Vdd to Vss directly),
> which would result in higher power consumption, but this isn't likely,
> plus you'd definitely see some instability or errors in CPU operation. So
> I personally don't think that CPU damaged by overheating can consume
> more power, but be stable, and then suddenly die some day; correct
> me if I'm wrong. 

When I ordered a replacement fan, I also ordered replacement heatsinks
(this is a dual-cpu motherboard). So I discarded old heatsinks and installed
new fan/heatsink combos and also applied a drop of Arctic Silver to each CPU
(after cleaning off the old thermal grease with isopropyl alcohol). 
 
> It is not very likely that your CPU was damaged by overheating too. It might
> not have been stable when overheated (no kidding!), but I believe the
> mainboard should power it down before it reaches temperature at which
> permanent damage results.

Agreed, I believe this is exactly what was happening with cpu1 when the fan seized. 
Unfortunately for me, I did not have SMP compiled into the kernel so the system
would just shut off. 

I am still, however, a bit confused as to what mbmon is outputting for me. This
is what I am currently seeing:

Temp.= 30.8, 28.6, 22.0; Rot.= 5818, 5113,    0
Vcore = 1.50, 1.50; Volt. = 3.35, 3.27,  7.93,   0.00,  0.00
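
For anyone who wants to log or compare these readings, here is a minimal
Python sketch that splits the two-line summary above into labeled lists.
The function name parse_mbmon is hypothetical, and the sketch assumes the
exact "Label = v1, v2, ..." layout shown above. It only labels the fields;
it cannot tell which temperature belongs to which CPU, since that depends
on how the board wires its sensors.

```python
import re

def parse_mbmon(text):
    """Split mbmon summary text into {label: [float, ...]}.

    Assumes fields of the form "Label = v1, v2, ..." separated by
    ';' or newlines, as in the sample output above.
    """
    fields = {}
    for chunk in re.split(r"[;\n]", text):
        if "=" not in chunk:
            continue  # skip anything that is not a "label = values" field
        label, values = chunk.split("=", 1)
        # "Temp." -> "Temp", "Volt. " -> "Volt"
        key = label.strip().rstrip(".")
        fields[key] = [float(v) for v in values.split(",")]
    return fields

sample = """Temp.= 30.8, 28.6, 22.0; Rot.= 5818, 5113,    0
Vcore = 1.50, 1.50; Volt. = 3.35, 3.27,  7.93,   0.00,  0.00"""

readings = parse_mbmon(sample)
for label, values in readings.items():
    print(label, values)
```

Run against the sample, this gives three temperatures, three fan speeds,
two Vcore values and five other voltages, which at least makes it easy to
watch which of the three temperatures moves when one CPU is loaded.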

I am assuming that 30.8 is the Tcpu of cpu1. But which one is the Tcpu of cpu2?
Here's the chipset mbmon is using to probe the values:

su-2.05b# mbmon -D
Probe Request: none
>>> Testing Reg's at VIA686 HWM <<<
Probing VIA686A/B chip:
  CR40:0x01,  CR41:0xD0,  CR42:0x9C,  CR43:0xFF
  CR44:0xFF,  CR47:0xF0,  CR49:0x7D,  CR4B:0x40
  CR3F:0xA2,  CR14:0x7E,  CR1F:0x7F,  CR20:0x93
  CR21:0x8E,  CR22:0x79,  CR23:0x79,  CR24:0xCE
  CR25:0x7F,  CR26:0x7F,  CR29:0x1D,  CR2B:0xFF
Using VIA686 HWM directly!!
* VIA Chip VT82C686A/B found.

I read the doc for mbmon but still couldn't really understand it. Do I need
to recompile the kernel with SMP in order for mbmon to read the values from
the second CPU? I didn't think that would be necessary. By the way, before anyone
asks, I do plan to compile SMP in the near future to utilize the second processor. 

Thanks.
-Yuri



More information about the freebsd-hardware mailing list