kern/121660: hwpmc(4) incorrectly handles PMC sampling events from AMD

Adrian Chadd adrian at FreeBSD.org
Thu Mar 13 04:00:08 UTC 2008


>Number:         121660
>Category:       kern
>Synopsis:       hwpmc(4) incorrectly handles PMC sampling events from AMD
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Mar 13 04:00:07 UTC 2008
>Closed-Date:
>Last-Modified:
>Originator:     Adrian Chadd
>Release:        FreeBSD 8.0-CURRENT i386
>Organization:
FreeBSD
>Environment:
System: FreeBSD jacinta.home.cacheboy.net 8.0-CURRENT FreeBSD 8.0-CURRENT #5: Sun Mar 9 19:34:11 UTC 2008 adrian at jacinta.home.cacheboy.net:/data/1/obj/usr/src/sys/JACINTA i386

Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-CURRENT #5: Sun Mar  9 19:34:11 UTC 2008
    adrian at jacinta.home.cacheboy.net:/data/1/obj/usr/src/sys/JACINTA
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) XP 1800+ (1540.49-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x681  Stepping = 1
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
  AMD Features=0xc0400800<SYSCALL,MMX+,3DNow!+,3DNow!>
real memory  = 2147418112 (2047 MB)
avail memory = 2093957120 (1996 MB)
MPTable: <OEM00000 PROD00000000>
ioapic0: Assuming intbase of 0
ioapic0 <Version 0.3> irqs 0-23 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
cpu0 on motherboard
pcib0: <MPTable Host-PCI bridge> pcibus 0 on motherboard
pci0: <PCI bus> on pcib0
agp0: <VIA 8377 (Apollo KT400/KT400A/KT600) host to PCI bridge> on hostb0
agp0: aperture size is 256M
pcib1: <MPTable PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
vgapci0: <VGA-compatible display> mem 0xd8000000-0xd9ffffff,0xda000000-0xda003fff,0xdb000000-0xdb7fffff irq 16 at device 0.0 on pci1
em0: <Intel(R) PRO/1000 Network Connection 6.8.4> port 0xd000-0xd03f mem 0xde020000-0xde03ffff,0xde000000-0xde01ffff irq 19 at device 11.0 on pci0
em0: [FILTER]
em0: Ethernet address: 00:0e:0c:b9:4c:f9
uhci0: <VIA 83C572 USB controller> port 0xd400-0xd41f irq 21 at device 16.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: <VIA 83C572 USB controller> on uhci0
usb0: USB revision 1.0
uhub0: <VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: <VIA 83C572 USB controller> port 0xd800-0xd81f irq 21 at device 16.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: <VIA 83C572 USB controller> on uhci1
usb1: USB revision 1.0
uhub1: <VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: <VIA 83C572 USB controller> port 0xdc00-0xdc1f irq 21 at device 16.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: <VIA 83C572 USB controller> on uhci2
usb2: USB revision 1.0
uhub2: <VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb2
uhub2: 2 ports with 2 removable, self powered
ehci0: <VIA VT6202 USB 2.0 controller> mem 0xde040000-0xde0400ff irq 19 at device 16.3 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: <VIA VT6202 USB 2.0 controller> on ehci0
usb3: USB revision 2.0
uhub3: <VIA EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb3
uhub3: 6 ports with 6 removable, self powered
isab0: <PCI-ISA bridge> at device 17.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <VIA 8235 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe000-0xe00f at device 17.1 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
pci0: <multimedia, audio> at device 17.5 (no driver attached)
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff pnpid ORM0000 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
fdc0: <Enhanced floppy controller> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: [FILTER]
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0: <Parallel port bus> on ppc0
ppbus0: [ITHREAD]
plip0: <PLIP network interface> on ppbus0
plip0: WARNING: using obsoleted IFF_NEEDSGIANT flag
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
ppc0: [GIANT-LOCKED]
ppc0: [ITHREAD]
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio0: [FILTER]
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
sio1: [FILTER]
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
unknown: <PNP0303> can't assign resources (port)
unknown: <PNP0c01> can't assign resources (memory)
unknown: <PNP0501> can't assign resources (port)
unknown: <PNP0700> can't assign resources (port)
unknown: <PNP0400> can't assign resources (port)
unknown: <PNP0501> can't assign resources (port)
Timecounter "TSC" frequency 1540490006 Hz quality 800
Timecounters tick every 1.000 msec
ad0: 38166MB <Seagate ST340016A 3.10> at ata0-master UDMA100
Trying to mount root from ufs:/dev/ad0s1a
WARNING: / was not properly dismounted
WARNING: /data/1 was not properly dismounted
WARNING: /home was not properly dismounted
WARNING: /var was not properly dismounted
ipfw2 (+ipv6) initialized, divert loadable, nat loadable, rule-based forwarding disabled, default to deny, logging disabled

>Description:

hwpmc returns nothing from a user-mode sample (pmcstat -P instructions -O sample.out -t <pid>)

The hwpmc registers are 48 bits wide on at least the Athlon XP platform
(and the code sets the width value in the class for all AMD CPUs to
48, not 64). Sampling is done by counting upwards until the counter
overflows and generates an interrupt. The code uses a two's complement
trick to turn the sample period into the initial counter value
needed to generate an NMI after that many events. It does this on a
64-bit value, but as the counters are 48 bits wide, a read returns a
48-bit value with the high 16 bits set to 0, and the virtual PMC code
quickly loses track.
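
As a rough illustration of the truncation described above, the
following C sketch models a 48-bit counter with plain 64-bit
arithmetic. It is not the hwpmc code: ctr_mask(), negate64(),
model_raw_read() and remaining_as_computed() are hypothetical names,
and the model assumes the hardware simply drops the upper 16 bits on
write and returns them as zero on read.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask covering a counter that is `width` bits wide. */
static uint64_t
ctr_mask(unsigned width)
{
	assert(width > 0 && width < 64);
	return ((UINT64_C(1) << width) - 1);
}

/*
 * Hypothetical helper: 64-bit two's complement negation. The same
 * operation maps a sample period to a counter value and back again.
 */
static uint64_t
negate64(uint64_t v)
{
	return (UINT64_C(0) - v);
}

/*
 * Model (an assumption, not driver code): program a `width`-bit counter
 * for `period`, let it count `events`, and return what a 64-bit read of
 * the register would yield.
 */
static uint64_t
model_raw_read(uint64_t period, uint64_t events, unsigned width)
{
	uint64_t mask = ctr_mask(width);
	uint64_t stored = negate64(period) & mask;	/* upper bits are lost */

	return ((stored + events) & mask);
}

/* What a plain 64-bit negation makes of that raw read. */
static uint64_t
remaining_as_computed(uint64_t period, uint64_t events, unsigned width)
{
	return (negate64(model_raw_read(period, events, width)));
}
```

With a period of 65536, 255 counted events and a width of 48,
model_raw_read() gives 0x0000ffffffff00ff and remaining_as_computed()
gives 0xffff00000000ff01 rather than the 65281 events actually left.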

My email to -current has more info:

Between my Athlon XP box giving me no useful pmc stats and my new Core
2 Duo box not even working with pmc, I decided to poke at the Athlon
XP support a bit to see if I could figure out what was going on.

It seems that at least my revision of the Athlon XP has 48-bit
performance counters (AMD Athlon Processor x86 Code Optimisation
Guide, page 235, "Performance-Monitoring Counters: Overview") and the
top 16 bits read back as 0x0000.

Since the code takes the two's complement of the stored PMC value
(so that the value is incremented to 0xffffffffffffffff and then wraps
over, generating an NMI, as mentioned on page 240), negating the value
read back gives humorous results:

(Note: some of these are my own debugging information.)

Mar  9 16:09:43 jacinta kernel: hwpmc: TSC/1/0x20<REA> K7/4/0x1ff<INT,USR,SYS,EDG,THR,REA,WRI,INV,QUA>
Mar  9 16:10:02 jacinta kernel: MDP:SWO:1: pc=0xc5814180 pp=0 enable-msr=0
Mar  9 16:10:02 jacinta kernel: local initial: ri 1: 65536
Mar  9 16:10:02 jacinta kernel: MDP:SWO:1: pc=0xc5814180 pp=0xc576a780 enable-msr=0
Mar  9 16:10:02 jacinta kernel: csw_in: ri 1; pmcval 65536
Mar  9 16:10:02 jacinta kernel: MDP:WRI:1: amd-write cpu=0 ri=1 v=ffffffffffff0000
Mar  9 16:10:02 jacinta kernel: MDP:SWI:1: pc=0xc5814180 pp=0xc576a780 enable-msr=0
Mar  9 16:10:02 jacinta kernel: MDP:REA:1: amd-read id=1 class=1
Mar  9 16:10:02 jacinta kernel: MDP:REA:2: amd-read id=1 -> ffff00000000ff01
Mar  9 16:10:02 jacinta kernel: read: ffff00000000ff01; saved 10000; diff -281474976710911
Mar  9 16:10:02 jacinta kernel: csw_out: ri 1: pp_pmcval 65536..
Mar  9 16:10:02 jacinta kernel: csw_out: ... ri 1: pp_pmcval now 281474976710911..
Mar  9 16:10:02 jacinta kernel: MDP:SWO:1: pc=0xc5814180 pp=0xc576a780 enable-msr=0
Mar  9 16:10:02 jacinta kernel: csw_in: ri 1; pmcval 281474976710911
Mar  9 16:10:02 jacinta kernel: MDP:WRI:1: amd-write cpu=0 ri=1 v=fffeffffffffff01
Mar  9 16:10:02 jacinta kernel: MDP:SWI:1: pc=0xc5814180 pp=0xc576a780 enable-msr=0
Mar  9 16:10:02 jacinta kernel: MDP:REA:1: amd-read id=1 class=1
Mar  9 16:10:02 jacinta kernel: MDP:REA:2: amd-read id=1 -> ffff00000000f47f
Mar  9 16:10:02 jacinta kernel: read: ffff00000000f47f; saved 10000000000ff; diff -562949953358976
Mar  9 16:10:02 jacinta kernel: csw_out: ri 1: pp_pmcval 281474976710911..
Mar  9 16:10:02 jacinta kernel: csw_out: ... ri 1: pp_pmcval now 844424930004351..

>How-To-Repeat:

pmcstat -P instructions -O sample.out -t <pid>

>Fix:

This attempts to "pretend" to be the expected value, and it does begin
recording sample events in the above test, but I don't believe it's
correct. If the value rolls over somehow then we'll be OR'ing in
high bits inappropriately.

I think it should be a sign-extend rather than my OR operation.
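
To make the difference between the two approaches concrete, here is a
small C sketch comparing the OR from the patch with a sign-extension
from bit 47. sign_extend() and or_high_bits() are hypothetical helpers
written for this illustration, not functions in hwpmc_amd.c, and the
48-bit width is taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: sign-extend the low `width` bits of `raw` to
 * 64 bits by copying bit (width - 1) into all higher bits.
 */
static uint64_t
sign_extend(uint64_t raw, unsigned width)
{
	uint64_t sign;

	assert(width > 0 && width < 64);
	sign = UINT64_C(1) << (width - 1);
	raw &= (sign << 1) - 1;		/* keep only the implemented bits */
	return ((raw ^ sign) - sign);	/* standard xor/subtract sign-extend */
}

/* The workaround from the patch: always set the top 16 bits. */
static uint64_t
or_high_bits(uint64_t raw)
{
	return (raw | UINT64_C(0xffff000000000000));
}
```

For a raw read of 0x0000ffffffff00ff both helpers return
0xffffffffffff00ff, which negates to 65281. Once the counter has
wrapped and bit 47 is clear (say a raw read of 0x10), sign_extend()
leaves the value at 0x10 while or_high_bits() returns
0xffff000000000010.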

Index: hwpmc_amd.c
===================================================================
RCS file: /share/FreeBSD/cvsrepo/src/sys/dev/hwpmc/hwpmc_amd.c,v
retrieving revision 1.14
diff -u -r1.14 hwpmc_amd.c
--- hwpmc_amd.c	7 Dec 2007 08:20:15 -0000	1.14
+++ hwpmc_amd.c	10 Mar 2008 12:06:49 -0000
@@ -303,7 +303,12 @@
 
 	tmp = rdmsr(pd->pm_perfctr); /* RDMSR serializes */
 	if (PMC_IS_SAMPLING_MODE(mode))
-		*v = AMD_PERFCTR_VALUE_TO_RELOAD_COUNT(tmp);
+		/*
+		 * The counters are 48 bit - so we need to "pretend" the 48 bit value
+		 * is 64 bit for the 2s complement conversion to convert correctly.
+		 * I don't think this is the "correct" answer!
+		 */
+		*v = AMD_PERFCTR_VALUE_TO_RELOAD_COUNT(tmp | 0xffff000000000000);
 	else
 		*v = tmp;
 
>Release-Note:
>Audit-Trail:
>Unformatted:

