top(1) doesn't report the correct CPU time for a multithreaded process

Yuri yuri at rawbw.com
Wed Mar 13 23:11:05 UTC 2013


I have a process that is CPU-bound with 1 thread for its first 5 seconds. 
It then creates 200 threads that all read from and write to the 
network, and it stays network-bound for the remaining 6.5 minutes.
When I look at this process in top(1) right after the 200 threads are 
created, I see WCPU and CPU values around 3400%, which then drop to 
below 1% for the rest of the run:
50619 yuri          206  20    0   621M   555M uwait   7   0:31 0.68% myapp

At the end, after all threads have exited, the process measures its 
resource usage with getrusage(RUSAGE_SELF, &u); which shows the CPU time 
consumed:
user=104609ms sys=8758ms wall=395938ms
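
For reference, a minimal sketch of how such a measurement can be taken. 
The helper names tv_to_ms and self_cpu_ms are made up for illustration 
and are not the code from myapp; only getrusage(2) and struct rusage 
are the real interface. RUSAGE_SELF reports the process as a whole, 
so the times of all its threads are included.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Convert a struct timeval to whole milliseconds (hypothetical helper). */
long tv_to_ms(struct timeval tv)
{
    return (long)tv.tv_sec * 1000L + (long)tv.tv_usec / 1000L;
}

/*
 * Store the user and system CPU time of the calling process, in
 * milliseconds. Returns 0 on success, -1 if getrusage(2) fails.
 * (Hypothetical helper, not taken from the original program.)
 */
int self_cpu_ms(long *user_ms, long *sys_ms)
{
    struct rusage u;

    assert(user_ms != NULL && sys_ms != NULL);
    if (getrusage(RUSAGE_SELF, &u) != 0)
        return -1;
    *user_ms = tv_to_ms(u.ru_utime);  /* time spent in user mode */
    *sys_ms = tv_to_ms(u.ru_stime);   /* time spent in the kernel */
    return 0;
}
```

Calling self_cpu_ms(&user, &sys) once just before exit gives the user= 
and sys= figures; the wall= figure has to come from a separate clock.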

So the "real" CPU percentage wasn't ~0.68% but more like 29% 
((104609 + 8758) / 395938), or about 7% if 400% is considered the 
maximum (there are 4 cores). I am inclined to trust getrusage(2).
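
The arithmetic behind that estimate, as a small sketch. The function 
name cpu_percent and its ncpu parameter are assumptions made for 
illustration; the inputs are the getrusage(2) figures quoted above.

```c
#include <assert.h>

/*
 * Average CPU usage over a run, in percent, where 100% means that
 * all ncpu cores were busy for the whole wall-clock interval.
 * Pass ncpu = 1 to express the result as a share of a single core.
 * (Hypothetical helper for illustration.)
 */
double cpu_percent(long user_ms, long sys_ms, long wall_ms, int ncpu)
{
    assert(wall_ms > 0 && ncpu > 0);
    return 100.0 * (double)(user_ms + sys_ms) / ((double)wall_ms * ncpu);
}
```

With user=104609, sys=8758 and wall=395938, cpu_percent(..., 1) comes 
out at about 28.6, and cpu_percent(..., 4) at about 7.2. This is an 
average over the whole run, so it is not directly comparable to a 
single top(1) sample.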

There was a PR about this, now marked as closed with a patch checked in: 
http://www.freebsd.org/cgi/query-pr.cgi?pr=127331
But the code from that patch doesn't seem to be in 
usr.bin/top/machine.c now (9.1-STABLE).
My original PR, considered a duplicate, is also closed: 
http://www.freebsd.org/cgi/query-pr.cgi?pr=135823

Why doesn't top(1) show the correct CPU time, aggregated over all threads? 
Is this a regression of the patch from PR#127331 above?
Also, why do I ever see 3400% CPU usage? That doesn't seem right in any case.

Yuri



More information about the freebsd-hackers mailing list