Re: observations on Ryzen 5xxx (Zen 3) processors

From: Hans Petter Selasky <hps_at_selasky.org>
Date: Wed, 22 Dec 2021 12:49:16 UTC

On 12/22/21 13:42, Andriy Gapon wrote:
> 
> There have been some reports on strange / unexpected things with Ryzen 
> 5xxx processors.  I think I have seen 5950X, 5900X and 5800X mentioned, 
> not sure about others.
> 
> Since I have 5800X myself I looked into a couple of issues that have 
> straightforward demonstrators.  I would like to share my findings and 
> observations on those issues.
> 
> Issue 1.  High wake-up latency for CPU idle states.
> 
> This seems to be related to the so called CC6 idle state.
> The official information on it is very sparse.
> The state is not explicitly exposed to the OS, at least not through the 
> ACPI interfaces that FreeBSD currently supports.
> 
> In my tests I see that if all logical processors enter an idle state 
> then an external interrupt can be delayed by 500+ us.  Specifically, I 
> observed this with an MSI-X interrupt from a discrete network chip.  
> Interrupts from internal components seem to be affected as well, but to 
> a lesser degree.
> 
> The deep state in question can be entered regardless of whether C2 (via 
> I/O) is enabled; C1 (via hlt) is sufficient.  In fact, with 
> machdep.idle=hlt it works the same.
> The state is not entered if at least one logical CPU is not idle.
> The state is not entered if machdep.idle=mwait is used.  Apparently, the 
> processors do not attempt to automatically enter as deep idle modes with 
> mwait as they do with hlt.
> Finally, the state is not entered if the zenstates.py utility is used to 
> disable the C6 / CC6 state via a publicly undocumented MSR.
> 
> For me personally that state does not cause any annoyances but anyone 
> who experiences problems related to "stuttering", "jitter", latency 
> might want to look into this.
> 
> Issue 2.  Uneven performance of CPU intensive tasks, especially with 
> SCHED_ULE, when SMT is enabled.
> 
> I found out that, at least on my hardware, all even-numbered logical CPUs 
> can perform much better than odd-numbered logical CPUs.  It seems that 
> hardware threads within a core are not equal.  Maybe this is related to 
> the ability to use boosted frequencies, or maybe to something else; I am 
> not sure.
> From a brief look at the ULE code, it looks like the selection of a hw 
> thread within a core is intentionally random when all other things are 
> equal.
> I suspect that the hardware + firmware may actually describe that 
> performance disparity via ACPI CPPC (_CPC object, etc), but right now we 
> do not support querying that or making use of it.
> 
> 
> It would be interesting to see if other owners of similar processors can 
> confirm or provide counter-examples to my observations.
> 
> Simple tests for issue 1:
> - ping a host attached to the same switch (so, with very low expected 
> latency)
> - ping 127.0.0.1
> 
> For issue 2: take some CPU intensive single-threaded task and bind it 
> (with cpuset -l) to different logical CPUs.  Multiple such tasks can be 
> run concurrently on different logical CPUs.
> 
> References:
> - https://forums.freebsd.org/threads/variable-ping-latency-on-ryzen-setup.82791/
> - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256594
> - https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254040
> - https://github.com/r4m0n/ZenStates-Linux
> - https://github.com/meowthink/ZenStates-FreeBSD --  has a bug
>    - https://github.com/avg-I/ZenStates-FreeBSD -- has a fix
> - https://www.kernel.org/doc/html/latest/admin-guide/acpi/cppc_sysfs.html
> - https://static.linaro.org/connect/lvc21/presentations/lvc21-219.pdf
> - https://uefi.org/specs/ACPI/6.4/14_Platform_Communications_Channel/Platform_Comm_Channel.html
> 
> 

Hi,

I've seen exactly the same thing. See the older freebsd-current thread:

"AMD Ryzen 5 3400G with Radeon Vega Graphics"

I just put:

machdep.idle=spin

in /boot/loader.conf for now.
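
To check whether changing machdep.idle makes any difference, the ping
test from the quoted message can be summarised before and after the
change. Below is a minimal sketch in Python, assuming ping prints the
usual "time=... ms" field for each reply. The function name and the
sample lines are made up for illustration; they are not measurements
from a real system.

```python
import re
import statistics

# Matches the per-packet round-trip time in ping output, e.g. "time=0.045 ms".
_RTT_RE = re.compile(r"time=(\d+(?:\.\d+)?) ms")


def summarize_ping(output):
    """Return (count, min, median, max) of the RTTs (in ms) found in ping output."""
    rtts = [float(m.group(1)) for m in _RTT_RE.finditer(output)]
    if not rtts:
        raise ValueError("no round-trip times found")
    return len(rtts), min(rtts), statistics.median(rtts), max(rtts)


if __name__ == "__main__":
    # Hypothetical sample lines, only to show the expected input shape.
    sample = (
        "64 bytes from 192.0.2.1: icmp_seq=0 ttl=64 time=0.112 ms\n"
        "64 bytes from 192.0.2.1: icmp_seq=1 ttl=64 time=0.640 ms\n"
        "64 bytes from 192.0.2.1: icmp_seq=2 ttl=64 time=0.098 ms\n"
    )
    print(summarize_ping(sample))
```

Feeding it the captured output of a longer ping run for each idle
setting should make outliers in the 500+ us range easy to spot in the
max column, if they occur.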

--HPS