[Bug 268857] pmcstat crashes on particular event/CPU combination

From: <bugzilla-noreply_at_freebsd.org>
Date: Tue, 10 Jan 2023 14:59:49 UTC

            Bug ID: 268857
           Summary: pmcstat crashes on particular event/CPU combination
           Product: Base System
           Version: 13.1-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: bin
          Assignee: bugs@FreeBSD.org
          Reporter: jfc@mit.edu

The following command crashes on Zen CPUs but not older AMD CPUs:

$ pmcstat -P k8-ic-refill-from-l2 echo -n
initlog   0x9030000 "AMD_K8"
Segmentation fault (core dumped)

Perhaps "k8-ic-refill-from-l2" is not a valid event for Zen.  If so, that is
not easily discoverable, and an invalid event should not crash the program.

lldb says

* thread #1, name = 'pmcstat', stop reason = breakpoint 1.1
    frame #0: 0x000000720ad83c02 libpmc.so.5`pmc_pmu_event_get_by_idx(cpuid=<unavailable>, idx=8350) at
   291          if ((pme = pmu_events_map_get(cpuid)) == NULL)
   292                  return (NULL);
-> 293          assert(pme->table[idx].name);
   294          return (pme->table[idx].name);
   295  }
(lldb) p pme
(const pmu_events_map *) $2 = 0x000000720af7f9f0
(lldb) p *pme
(const pmu_events_map) $3 = {
  cpuid = 0x000000720abe6054 "AuthenticAMD-23-[[:xdigit:]]+"
  version = 0x000000720ad0c2ad "v1"
  type = 0x000000720ad18386 "core"
  table = 0x000000720af70890

The array index idx=8350 is out of bounds, so reading pme->table[idx].name
causes the segfault.  I would suggest a bounds check, but I don't see any array
size field in pmu_events_map to compare against.
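
A minimal sketch of the kind of bounds check meant here.  It assumes the
event table ends with a sentinel entry whose name is NULL, which is not
confirmed by the lldb output above.  The struct and function names
(sketch_event, sketch_map, sketch_table_len, sketch_event_get_by_idx) are
hypothetical stand-ins, not the libpmc definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the libpmc types; field names are illustrative. */
struct sketch_event {
	const char *name;
};

struct sketch_map {
	const char *cpuid;
	/* ASSUMPTION: the table ends with an entry whose name is NULL. */
	const struct sketch_event *table;
};

/* Count the entries by walking to the assumed NULL-name sentinel. */
static size_t
sketch_table_len(const struct sketch_map *pme)
{
	size_t n = 0;

	while (pme->table[n].name != NULL)
		n++;
	return (n);
}

/* Return the event name for idx, or NULL if idx is out of range. */
static const char *
sketch_event_get_by_idx(const struct sketch_map *pme, int idx)
{
	if (pme == NULL || pme->table == NULL || idx < 0)
		return (NULL);
	if ((size_t)idx >= sketch_table_len(pme))
		return (NULL);
	return (pme->table[idx].name);
}
```

With a check like this, an index such as 8350 into a shorter table would
yield NULL instead of a read past the end of the array.  Walking the table on
every lookup is linear in its length; if the sentinel assumption holds, the
length could be computed once and cached.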

More specifically, pmcstat crashes on

CPU: AMD EPYC 7402P 24-Core Processor                (2794.84-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x830f10  Family=0x17  Model=0x31  Stepping=0
CPU: AMD Ryzen 5 PRO 2400GE w/ Radeon Vega Graphics  (3194.22-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x810f10  Family=0x17  Model=0x11  Stepping=0

but pmcstat does not crash on

CPU: AMD Opteron(tm) X3421 APU                       (2096.10-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x660f01  Family=0x15  Model=0x60  Stepping=1

I am reporting against 13.1-STABLE.  The bug is also present in CURRENT as of
last summer.
