[Bug 257641] hwpmc/libpmc needs to gain a notion of big.LITTLE

From: <bugzilla-noreply_at_freebsd.org>
Date: Thu, 05 Aug 2021 17:35:35 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257641

            Bug ID: 257641
           Summary: hwpmc/libpmc needs to gain a notion of big.LITTLE
           Product: Base System
           Version: Unspecified
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: mhorne@freebsd.org

Some systems that FreeBSD supports contain a heterogeneous collection of CPUs.
This is the case for ARM big.LITTLE SoCs, such as the RK3399 on the rockpro64,
and will be a feature of some next-generation x86 chips as well [1][2]. The PMC
stack was written before such heterogeneous systems existed, and so it assumes
throughout that all cores in the system have the same performance monitoring
capabilities. This assumption is stated explicitly in the hwpmc(4) man page
under IMPLEMENTATION NOTES.

The rockpro64's RK3399, for example, contains four Cortex-A53 cores and two
larger Cortex-A72 cores. The two core types share some supported performance
events, but each also has events that are unique to it. This poses problems
that hwpmc is not currently equipped to deal with.

The first problem to solve is CPU reporting. The kernel communicates the CPU
type to libpmc in two ways: the kern.hwpmc.cpuid sysctl and the
PMC_OP_GETCPUINFO operation on the hwpmc syscall. Neither method distinguishes
between different CPUs in the system, so the value received by userspace
depends on which CPU happens to initialize the hwpmc module. This needs to
become a per-CPU value, so that the events supported on a given core can be
detected properly.
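
As a rough illustration of the reporting problem, the toy Python model below
contrasts a single global identifier with a per-CPU lookup. It is a sketch
only, not hwpmc code: the identifier strings ("little", "big"), the event
names, and all function names are hypothetical placeholders, and the CPU
layout merely mirrors the four-plus-two arrangement described above.

```python
# Toy model of CPU reporting; not hwpmc code. All names and values here are
# hypothetical placeholders chosen for illustration.

# Per-CPU identifier, as a heterogeneous system would need to report it.
# CPUs 0-3 are one core type, CPUs 4-5 another.
CPU_IDS = {0: "little", 1: "little", 2: "little", 3: "little",
           4: "big", 5: "big"}

# Events each core type supports (placeholder names).
EVENTS_BY_ID = {
    "little": {"EV_COMMON", "EV_LITTLE_ONLY"},
    "big": {"EV_COMMON", "EV_BIG_ONLY"},
}


def global_cpuid(init_cpu):
    """Single-value reporting: the result depends on which CPU ran the init."""
    return CPU_IDS[init_cpu]


def per_cpu_cpuid(cpu):
    """Per-CPU reporting: the identifier is looked up for a specific CPU."""
    return CPU_IDS[cpu]


def event_supported(event, cpu):
    """Check an event against the identifier of the CPU it would run on."""
    return event in EVENTS_BY_ID[per_cpu_cpuid(cpu)]
```

With the single global value, userspace sees a different answer depending on
whether the module was initialized on CPU 0 or CPU 4; with the per-CPU lookup,
support for an event can be checked for each core individually.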

Assuming this is solved, the basic high-level behaviour will depend on the type
of PMC being allocated:

System-scope PMCs:
Allocating a system-scope counter with e.g. pmcstat -s <event> will attempt to
allocate the event on every CPU in the system. If the allocation fails for any
CPU, the command will not proceed with any measurement. This is reasonable
behaviour on a heterogeneous system: the user needs to either pick an event
that is compatible with all CPUs, or use the -c flag to restrict the selected
CPUs.
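
The all-or-nothing behaviour described above can be modelled in a few lines of
Python. This is a sketch under stated assumptions, not pmcstat code: the event
names, the four-CPU layout, and the allocate_system_scope() helper are all
hypothetical.

```python
# Toy model of all-or-nothing system-scope allocation; not pmcstat code.
# Event names and the CPU layout are hypothetical placeholders.

SUPPORTED = {
    0: {"EV_COMMON", "EV_LITTLE_ONLY"},
    1: {"EV_COMMON", "EV_LITTLE_ONLY"},
    2: {"EV_COMMON", "EV_BIG_ONLY"},
    3: {"EV_COMMON", "EV_BIG_ONLY"},
}


def allocate_system_scope(event, cpus=None):
    """Return the CPUs the event was allocated on, or raise if any CPU fails.

    cpus=None models the default of every CPU in the system; passing a list
    models restricting the selection, as the -c flag does.
    """
    targets = sorted(SUPPORTED) if cpus is None else list(cpus)
    failed = [cpu for cpu in targets if event not in SUPPORTED[cpu]]
    if failed:
        # One unsupported CPU is enough to abandon the whole measurement.
        raise ValueError(f"{event} unsupported on CPUs {failed}")
    return targets
```

In this model, allocate_system_scope("EV_COMMON") succeeds on every CPU, while
allocate_system_scope("EV_BIG_ONLY") raises unless the caller restricts it with
cpus=[2, 3].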

Process-scope PMCs:
Allocating a process-scope counter is slightly more problematic. Suppose a PMC
is allocated on CPU A, where the target process is running and the requested
event is supported. If the process migrates to CPU B, which is of a different
type than A, then resuming the hardware counter there could start measuring an
entirely different event, if the programmed value is valid at all.
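
To make the migration hazard concrete, here is a small Python sketch. The raw
encodings and event names are invented for illustration and do not correspond
to any real CPU; resume_counter() is a hypothetical stand-in for reprogramming
the saved value on the new core.

```python
# Toy model of resuming a process-scope counter after migration.
# The encodings below are made up and match no real hardware.

RAW_EVENTS = {
    "A": {0x10: "EV_X"},
    "B": {0x10: "EV_Y"},  # same encoding, different meaning
    "C": {},              # encoding 0x10 is not valid here at all
}


def resume_counter(raw_code, core_type):
    """Return the event the hardware would count after a naive resume.

    Returns None when the saved encoding is not valid on this core type.
    """
    return RAW_EVENTS[core_type].get(raw_code)
```

A counter programmed with 0x10 on a type A core counts EV_X; resumed unchanged
on a type B core it would silently count EV_Y, and on a type C core the
programmed value is not valid at all.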

I see two possible ways to solve this: either don't allow PMC-enabled
processes (curproc->p_flag & P_HWPMC) to migrate outside of their
PMC-compatible cluster, or have libpmc call cpuset(3) for the process and bind
it to compatible CPUs for the duration of the measurement. I have not thought
through either of these approaches in detail, but both require building some
list of "PMC-compatible" CPU groups/clusters in the kernel.
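
One way such a list might be built is sketched below in Python. It assumes,
purely for illustration, that two CPUs are PMC-compatible exactly when they
report the same identifier; the real criterion is an open question. The
function names and the data passed to them are hypothetical.

```python
# Toy model of building "PMC-compatible" clusters; not kernel code.
# Assumption: CPUs reporting the same identifier are PMC-compatible.
from collections import defaultdict


def build_clusters(cpu_ids):
    """Group CPU numbers by their reported identifier."""
    clusters = defaultdict(list)
    for cpu, ident in sorted(cpu_ids.items()):
        clusters[ident].append(cpu)
    return dict(clusters)


def compatible_cpus(event, clusters, events_by_id):
    """Return the CPUs a process could run on while `event` is measured."""
    cpus = []
    for ident, members in clusters.items():
        if event in events_by_id[ident]:
            cpus.extend(members)
    return sorted(cpus)
```

Either approach could consume the result of compatible_cpus(): the scheduler
would refuse migrations outside that set, or libpmc would bind the process to
it before starting the measurement.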



[1]
https://www.cnx-software.com/2021/07/10/intel-alder-lake-hybrid-mobile-processor-family-to-range-from-5w-to-55w-tdp/
[2]
https://www.tomshardware.com/news/amd-patent-hybrid-cpu-rival-intel-raptor-lake-cpu
