top(1) loses process user time count when threads end
John Baldwin
jhb at freebsd.org
Mon Aug 22 16:21:40 UTC 2011
On Tuesday, August 16, 2011 9:25:20 am John Baldwin wrote:
> On Saturday, August 06, 2011 1:57:35 pm Yuri wrote:
> > On 08/06/2011 02:11, Alexander Best wrote:
> > > On Fri Aug 5 11, Yuri wrote:
> > >> I have the process that first runs in 3 threads but later two active
> > >> threads exit.
> > >>
> > >> top(1) shows this moment as follows (1-second intervals):
> > >> 30833 yuri 3 76 0 4729M 4225M nanslp 4 0:32 88.62% app
> > >> 30833 yuri 3 76 0 4729M 4225M nanslp 6 0:34 90.92% app
> > >> 30833 yuri 1 96 0 4729M 4225M CPU1 1 0:03 1.17% app
> > >> 30833 yuri 1 98 0 4729M 4226M CPU1 1 0:04 12.89% app
> > >>
> > >> The process time drops from 0:34 to 0:03. WCPU also drops, from
> > >> 90.92% to 1.17%, even though this process is CPU-bound and keeps
> > >> doing intensive work right after the threads exit.
> > >>
> > >> getrusage(2), however, called from within the process, shows the correct user time.
> > >>
> > >> I think this is a major bug in process time accounting.
> > > could you check whether kern/128177 or kern/140892 describe your situation?
> >
> > I am using the ULE scheduler. kern/128177 describes a single-threaded
> > case with ULE, and my issue involves threads, so I am not sure it is
> > related. There has been no motion on kern/128177 since Feb 9, 2009.
> > kern/140892 is probably the same as mine.
> >
> > In any case, both of these PRs should be fixed, since they are very
> > user-visible problems, not just obscure issues.
I am not able to reproduce this issue on 8-stable (this is without my
suggested patch). I wrote a simple program that fires off 4 CPU-bound
threads, which are signalled to exit when I hit a key on stdin. After
that, the main thread starts doing CPU-intensive work.
While my CPU % dropped (the main thread was idle in getchar() while
the worker threads ran), I did not see a similar drop in the process
time:
7644 jhb 5 83 0 10664K 1492K CPU3 3 0:18 150.29% threadtime
7644 jhb 5 85 0 10664K 1492K CPU3 3 0:22 175.05% threadtime
7644 jhb 5 86 0 10664K 1492K CPU3 3 0:26 197.46% threadtime
7644 jhb 5 88 0 10664K 1492K CPU3 3 0:30 217.97% threadtime
7644 jhb 5 89 0 10664K 1492K CPU3 3 0:34 236.72% threadtime
7644 jhb 5 91 0 10664K 1492K CPU3 3 0:38 253.52% threadtime
7644 jhb 5 92 0 10664K 1492K CPU3 3 0:42 269.14% threadtime
7644 jhb 1 73 0 10664K 1432K CPU0 0 0:45 6.40% threadtime
7644 jhb 1 76 0 10664K 1432K CPU4 0 0:46 15.28% threadtime
7644 jhb 1 78 0 10664K 1432K CPU0 0 0:46 23.39% threadtime
7644 jhb 1 81 0 10664K 1432K CPU0 0 0:47 30.76% threadtime
7644 jhb 1 83 0 10664K 1432K CPU0 0 0:48 37.50% threadtime
7644 jhb 1 85 0 10664K 1432K CPU0 0 0:49 43.55% threadtime
7644 jhb 1 86 0 10664K 1432K CPU0 0 0:50 49.07% threadtime
Can you come up with a simple test case that shows the bug for you? Also,
can you try the patch I posted earlier?
--
John Baldwin