[Bug 269228] coretemp: incorrect tjmax for desktop and server Core 2 Duo/Xeon 51xx 2 cores 65nm (Conroe, Woodcrest, possible Allendale): 85°C, but must be 100°C

From: <bugzilla-noreply_at_freebsd.org>
Date: Sun, 29 Jan 2023 16:59:50 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269228

            Bug ID: 269228
           Summary: coretemp: incorrect tjmax for desktop and server Core
                    2 Duo/Xeon 51xx 2cores 65nm (Conroe, Woodcrest,
                    possible Allendale): 85°C, but must be 100°C
           Product: Base System
           Version: 13.1-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: vvd@unislabs.com

FreeBSD 13.1-p amd64.
On some Core 2 Duo CPUs and the equivalent Xeons, the reported temperature is
implausibly low: for example, 13°C on the CPU cores in a server room with 16°C
air temperature, or 33°C on a desktop in a 28°C room. In the same server room,
Core 2 Quad, Core i7 920/930, Xeon X3430 and Xeon 3104 report temperatures from
24 to 50.
So I started investigating.

dev.cpu.X.temperature = dev.cpu.X.coretemp.tjmax - dev.cpu.X.coretemp.delta
For a Core 2 Quad Q6600, tjmax = 100, but for the Core 2 Duo it is 85, which is odd.
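
To make the effect of a wrong tjmax concrete, here is a small sketch of the
arithmetic. The helper names are made up for illustration and do not come from
coretemp.c. A reading of 13°C with tjmax = 85 implies delta = 72; if tjmax were
really 100, the same delta would give 28°C, and the desktop's 33°C would become
48°C.

```c
/* Hypothetical helper: temperature as tjmax - delta, per the formula above. */
static int
core_temp(int tjmax, int delta)
{
	return (tjmax - delta);
}

/* Hypothetical helper: recover delta from a reported temperature. */
static int
delta_from_reading(int tjmax, int reported)
{
	return (tjmax - reported);
}
```

Both corrected values are above the room temperatures quoted above, which is at
least physically plausible, unlike the original readings.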

Checking the sources: https://cgit.freebsd.org/src/tree/sys/dev/coretemp/coretemp.c
        if ((cpu_model == 0xf && cpu_stepping >= 2) || cpu_model == 0xe) {
                /*
                 * On some Core 2 CPUs, there's an undocumented MSR that
                 * can tell us if Tj(max) is 100 or 85.
                 *
                 * The if-clause for CPUs having the MSR_IA32_EXT_CONFIG was
                 * adapted from the Linux coretemp driver.
                 */
                msr = rdmsr(MSR_IA32_EXT_CONFIG);
                if (msr & (1 << 30))
                        sc->sc_tjmax = 85;
        } else if (cpu_model == 0x17) {

Now the Linux driver:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/hwmon/coretemp.c
                /*
                 * Now we can detect the mobile CPU using Intel provided table
                 * http://softwarecommunity.intel.com/Wiki/Mobility/720.htm
                 * For Core2 cores, check MSR 0x17, bit 28 1 = Mobile CPU
                 */
                err = rdmsr_safe_on_cpu(id, 0x17, &eax, &edx);
                if (err) {
                        dev_warn(dev,
                                 "Unable to access MSR 0x17, assuming desktop"
                                 " CPU\n");
                        usemsr_ee = 0;
                } else if (c->x86_model < 0x17 && !(eax & 0x10000000)) {
                        /*
                         * Trust bit 28 up to Penryn, I could not find any
                         * documentation on that; if you happen to know
                         * someone at Intel please ask
                         */
                        usemsr_ee = 0;
                } else {

This looks like very different logic: Linux reads a different MSR (0x17) to
decide whether the CPU is mobile.

I think FreeBSD's coretemp has an incorrect tjmax check for Conroe.

I'll do more tests a bit later with Linux and FreeBSD on different CPUs: Core 2
Duo E6xxx/E4xxx, Pentium Dual-Core E2xxx, and Wolfdale Core 2 Duo E7xxx.

I would also like to apply Linux's logic with MSR 0x17 (MSR_IA32_PLATFORM_ID)
to coretemp and test it on my hardware, but I can't promise I will get to it.
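
A rough sketch of what that combined check might look like, written as a pure
function so it can be reasoned about without touching real MSRs. This is not a
tested patch: the function name, the macro names, and the idea of passing raw
MSR values as arguments are assumptions made for illustration. The bit
positions are taken from the two snippets quoted above (bit 28 of MSR 0x17 for
"mobile", bit 30 of MSR_IA32_EXT_CONFIG for Tj(max) = 85), and the default of
100 is assumed.

```c
#include <stdint.h>

/* Bit 28 of MSR 0x17: 1 = mobile CPU, per the quoted Linux comment. */
#define PLATFORM_ID_MOBILE	(1u << 28)
/* Bit 30 of MSR_IA32_EXT_CONFIG: Tj(max) is 85, per the quoted FreeBSD code. */
#define EXT_CONFIG_TJ85		(1u << 30)

/*
 * Hypothetical helper: guess Tj(max) from CPU model/stepping and raw MSR
 * values. Only trusts the EXT_CONFIG bit when the CPU reports itself as
 * mobile; otherwise keeps the assumed default of 100.
 */
static int
guess_tjmax(unsigned model, unsigned stepping, uint64_t platform_id,
    uint64_t ext_config)
{
	int tjmax = 100;	/* assumed default */

	if ((model == 0xf && stepping >= 2) || model == 0xe) {
		if ((platform_id & PLATFORM_ID_MOBILE) != 0 &&
		    (ext_config & EXT_CONFIG_TJ85) != 0)
			tjmax = 85;
	}
	return (tjmax);
}
```

With this gating, a desktop or server CPU whose EXT_CONFIG bit 30 happens to be
set would still get 100, which is the behaviour this report argues for. Whether
bit 28 is reliable on all affected models remains to be checked on hardware.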

P.S. I hope my poor English doesn't make this hard to understand…
