[Bug 265965] Intel SpeedShift (hwpstate_intel) seems to affect system stability

From: <bugzilla-noreply_at_freebsd.org>
Date: Sun, 21 Aug 2022 02:03:24 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=265965

            Bug ID: 265965
           Summary: Intel SpeedShift (hwpstate_intel) seems to affect
                    system stability
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: henry.hu.sh@gmail.com

Recently I've upgraded my system with a Intel 12th gen i5 12600K CPU (with a
Z690 motherboard and DDR5 memories). After that, when I build world with -j16,
it often fails with all kinds of clang crashes.
The crashes happen in random places. Some examples:
* https://pastebin.com/sPi9EJ7Q
* https://pastebin.com/KtBQPzd7
The stack traces do not make much sense, other than it seems to often related
to jemalloc.

I happen to notice that the CPU frequencies shown in `sysctl dev.cpu` do not
make sense. The cpufreq seems to be using the hwpstate_intel driver. I kind of
remember that this was an old driver, and the SpeedStep (est) driver is better,
so I disabled it in driver hints. Now:
* The frequencies make sense: dev.cpu.0.freq_levels: 3701/125000 3700/125000
3500/114836 3300/106234 3100/96794 2900/88829 2700/80049 2500/72716 2200/61704
2000/55088 1800/47739 1600/41680 1400/34950 1200/29439 1000/24183 800/18310
* I no longer get crashes! I can build world with -j16 with no issue.

Thus, I think that the hwpstate_intel driver should possibly be disabled for
the newer Intel CPUs (and I think it's recommended to be disabled anyway?).

-- 
You are receiving this mail because:
You are the assignee for the bug.