[Bug 198149] [hwpmc] pmcstat -P -t (top mode, process sampling) stops after a while

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Fri May 8 10:58:46 UTC 2015


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=198149

--- Comment #14 from John Baldwin <jhb at FreeBSD.org> ---
This survived an overnight run with pmcstat still getting samples this morning.
I added debugging to print a message each time one of these fixes was applied
(and in the case of the first patch, I printed the raw value of the PMC).
Both of these conditions fired fairly consistently during the test (once every
few seconds or so).  In addition, when I had run with just the first patch, I
had seen raw PMC counter values in my debug messages that could be a bit
large, for example:

CPU 1: counter overflowed: 87516
CPU 1: counter overflowed: 22
CPU 12: counter overflowed: 2
CPU 1: counter overflowed: 2
CPU 1: counter overflowed: 13629
CPU 1: counter overflowed: 2
CPU 1: counter overflowed: 2

With both patches applied I do not see "large" values, only small ones:

CPU 5: fixing zero PMC
CPU 1: counter overflowed: 20
CPU 1: fixing zero PMC
CPU 1: counter overflowed: 2
CPU 1: fixing zero PMC
CPU 1: fixing zero PMC
CPU 15: fixing zero PMC
CPU 1: fixing zero PMC
CPU 1: counter overflowed: 4
CPU 1: counter overflowed: 2
CPU 1: counter overflowed: 2
CPU 1: fixing zero PMC
CPU 1: counter overflowed: 2
CPU 1: fixing zero PMC
CPU 1: counter overflowed: 2
CPU 8: fixing zero PMC
CPU 1: counter overflowed: 2
CPU 1: counter overflowed: 2
CPU 1: counter overflowed: 2
CPU 2: counter overflowed: 2
CPU 9: counter overflowed: 2
CPU 3: fixing zero PMC
CPU 8: fixing zero PMC
CPU 1: counter overflowed: 22
CPU 1: fixing zero PMC
CPU 1: fixing zero PMC
CPU 1: counter overflowed: 2
CPU 1: counter overflowed: 27
CPU 1: counter overflowed: 5
CPU 1: fixing zero PMC
.....

I had wondered if the second bug (writing a PMC value of zero) could have been
the source of the first bug (you can see how it would easily trigger it: if
you write a PMC of zero and the event happens 2 times before your next context
switch you would have a raw value of "2" when you switched out).  However,
while it seems to have fixed some of them (the "large" ones) it does not seem
to have fixed all of them.  I do think the second fix is legit (and the bug
has been present since sampling was added to PMC).  I think the first
change is also technically correct, but I'm not sure why we are seeing those
values.
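
To make the "raw value of 2" scenario above concrete, here is a minimal
standalone sketch.  It is not hwpmc code.  It assumes a 48-bit up-counter that
raises its sampling interrupt when it wraps, and that a reload count is
programmed as 2^48 minus the count.  PMC_WIDTH and all three function names
are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 48-bit up-counter that interrupts when it wraps to zero. */
#define PMC_WIDTH	48
#define PMC_MASK	((UINT64_C(1) << PMC_WIDTH) - 1)

/* Hypothetical: raw value to program so the counter wraps after `reload` events. */
uint64_t
reload_to_raw(uint64_t reload)
{
	assert(reload <= PMC_MASK);
	/* A reload count of zero maps to a raw value of zero. */
	return ((PMC_MASK + 1 - reload) & PMC_MASK);
}

/* Hypothetical: advance the raw counter by `events`, wrapping at the counter width. */
uint64_t
count_events(uint64_t raw, uint64_t events)
{
	return ((raw + events) & PMC_MASK);
}

/* Hypothetical: events still needed before the counter wraps and interrupts. */
uint64_t
events_until_overflow(uint64_t raw)
{
	assert(raw <= PMC_MASK);
	return (PMC_MASK + 1 - raw);
}
```

Under these assumptions, count_events(reload_to_raw(0), 2) yields a raw value
of 2, and events_until_overflow(2) is 2^48 - 2, so a counter written with zero
would need a very long time to produce its next sample.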
