misc/126230: weird process cpu usage stats and weird load average

Jakub Jasinski jakub.jasinski at utoronto.ca
Tue Aug 5 21:00:10 UTC 2008


The following reply was made to PR misc/126230; it has been noted by GNATS.

From: Jakub Jasinski <jakub.jasinski at utoronto.ca>
To: Kris Kennaway <kris at FreeBSD.org>
Cc: freebsd-gnats-submit at FreeBSD.org
Subject: Re: misc/126230: weird process cpu usage stats and weird load
 average
Date: Tue, 5 Aug 2008 17:05:11 -0400

 It looks like a select-related bug.
 
 I compiled the sample select program (except that it waits for 120
 seconds):
 
 #include <stdio.h>
 #include <sys/time.h>
 #include <sys/types.h>
 #include <unistd.h>
 
 int
 main(void)
 {
         fd_set          rfds;
         struct timeval  tv;
         int             retval;
 
         /* Watch stdin (fd 0) to see when it has input. */
         FD_ZERO(&rfds);
         FD_SET(0, &rfds);
         /* Wait up to five seconds. */
         tv.tv_sec = 120;
         tv.tv_usec = 0;
 
         retval = select(1, &rfds, NULL, NULL, &tv);
         /* Don't rely on the value of tv now! */
 
         if (retval == -1)
                 perror("select()");
         else if (retval)
                 printf("Data is available now.\n");
         /* FD_ISSET(0, &rfds) will be true. */
         else
                 printf("No data within 120 seconds.\n");
 
         return 0;
 }
 
 And this is what I get after running rsync:
 
 last pid:  1193;  load averages:  0.00,  0.00,  0.00          up 0
 +01:47:18  16:51:41 63 processes:  1 running, 62 sleeping
 CPU states:  0.3% user,  0.0% nice, 19.7% system,  1.0% interrupt,
 78.9% idle Mem: 146M Active, 236M Inact, 247M Wired, 116K Cache, 214M
 Buf, 7284M Free Swap: 8192M Total, 8192M Free
 
   PID USERNAME    THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU
 COMMAN
 590 root 2   4    0 14112K  2516K select 2   0:20 9.42% apcupsd
 1193 root 1 -16    0 17196K 12176K wdrain 0 0:03  8.15% rsync
 767 root 1   4    0 30108K  5208K select 0 0:12  5.96% sendmail
 684 root 1   4    0  9940K  1260K select 3   0:11  5.08% authdaemond
 1192 root          1  46    0  7656K  2192K CPU0   0   0:01  3.66% top
 1188 admin         1   4    0  2532K   660K select 3   0:02  3.56% a.out
 
 [snipped]
 
 a.out is consuming 3.56% while select-ing.  That's funny. :)
 
 On Tue, 5 Aug 2008 18:10:03 GMT
 Kris Kennaway <kris at FreeBSD.org> wrote:
 
 > The following reply was made to PR misc/126230; it has been noted by
 > GNATS.
 > 
 > From: Kris Kennaway <kris at FreeBSD.org>
 > To: Jakub Jasinski <jakub.jasinski at utoronto.ca>
 > Cc: freebsd-gnats-submit at FreeBSD.org
 > Subject: Re: misc/126230: weird process cpu usage stats and weird
 > load average Date: Tue, 05 Aug 2008 20:08:36 +0200
 > 
 >  Jakub Jasinski wrote:
 >  > uhci2: <Intel 631XESB/632XESB/3100 USB controller USB-3> port
 >  > 0x1040-0x105f irq 18 at device 29.2 on pci0 uhci2: [GIANT-LOCKED]
 >  > uhci2: [ITHREAD]
 >  > usb2: <Intel 631XESB/632XESB/3100 USB controller USB-3> on uhci2
 >  > usb2: USB revision 1.0
 >  > uhub2: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on
 >  > usb2 uhub2: 2 ports with 2 removable, self powered
 >  > ukbd0: <No brand KVM A1, class 0/0, rev 1.10/0.00, addr 2> on uhub2
 >  > kbd2 at ukbd0
 >  > ums0: <No brand KVM A1, class 0/0, rev 1.10/0.00, addr 2> on uhub2
 >  > ums0: 5 buttons and Z dir.
 >  > 
 >  > Ok, so the keyboard and mouse are there.  I detached them, killed
 >  > moused, rerun rsync and I got the same weird top output:
 >  
 >  Are the devices still attached (i.e. did they detach)?  What if you 
 >  compile a kernel with no USB support at all (obviously you won't be
 > able to log in with a USB keyboard, but you can still log in
 > remotely). 
 >  Kris
 >  
 >  
 >  > 
 >  > FreeBSD 7.0-RELEASE (CUSTOM) #2: Mon Aug  4 12:44:23 EDT 2008
 >  > last pid:  1258;  load averages:  0.00,  0.00,  0.00   up 0
 >  > +01:09:53 13:20:17 last pid:  1266;  load averages:  1.98,  0.55,
 >  > 0.20   up 0 +01:11:33  13:21:57 58 processes:  1 running, 57
 >  > sleeping CPU states:  0.9% user,  0.0% nice, 36.0% system,  3.5%
 >  > interrupt, 59.6% idle Mem: 142M Active, 59M Inact, 278M Wired, 48K
 >  > Cache, 214M Buf, 7433M Free Swap: 8192M Total, 8192M Free
 >  > 
 >  >   PID USERNAME    THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU
 >  > COMMAND
 >  > 662 root 2   4    0 14112K  2504K select 3   0:15 12.65% apcupsd
 >  > 1266 root 1  54    0 17196K 12112K select 0 0:03 10.50% rsync
 >  > 756 root 1   4    0  9940K  1260K select 2 0:08  6.40% authdaemon
 >  > 1259 root 1  48    0  7656K  2328K CPU1   1   0:02  5.27% top
 >  > 981 root 1   4    0 30108K  5292K select 2   0:06  4.59% sendmail
 >  > 1265 root 1   4    0 17196K 11824K select 3   0:02  4.59% rsync
 >  > 790 root 1   4    0 9940K  1260K select 1   0:06  4.49% authdaemon
 >  > 774 root 1 4    0  3604K  1220K select 2   0:06  4.49% couriertcp
 >  > 791 root 1   4    0  9940K  1260K select 0   0:06  4.49% authdaemon
 >  > 
 >  > etc.
 >  > 
 >  > ...and why are apcupsd (connected via serial port),  syslogd,
 >  > couriertcpd, sendmail, etc. working a the _exact_ same time for the
 >  > same duration at around the same WPCU%?
 >  > 
 >  > Just for comparison, this is a summary of top on FreeBSD 6.2 (32
 >  > bit) on another non-deployed box running pretty much the same
 >  > software (courier imap, clamav, spamassassin, etc.) after rsync is
 >  > activated: 43.48% for one rsync process
 >  > 1.96% for another
 >  > 0.00% for everything else.
 >  > 
 >  > On Tue, 5 Aug 2008 16:50:04 GMT
 >  > Kris Kennaway <kris at FreeBSD.org> wrote:
 >  > 
 >  >>  Jakub Jasinski wrote:
 >  >>  > I recompiled the kernel with the ULE scheduler and the only
 >  >>  > difference now is that every 'working' process has higher WCPU
 >  >>  > _after_ rsync is activated. Before rsync is activated all
 >  >>  > process have 0% WCPU.
 >  >>  > 
 >  >>  > BTW, thanks for the help.
 >  >>  > 
 >  >>  > [root at mail2 ~]# vmstat -i
 >  >>  > interrupt                          total       rate
 >  >>  > irq1: atkbd0                          78          0
 >  >>  > irq4: sio0                          2023          5
 >  >>  > irq14: ata0                           58          0
 >  >>  > irq16: uhci0 ehci0                     1          0
 >  >>  > irq18: ciss0 uhci2                  2618          6
 >  >>  > irq22: uhci4                          19          0
 >  >>  > cpu0: timer                       751994       1994
 >  >>  > irq257: bce1                        2271          6
 >  >>  > cpu1: timer                       744491       1974
 >  >>  > cpu2: timer                       744494       1974
 >  >>  > cpu3: timer                       744513       1974
 >  >>  > Total                            2992560       7937
 >  >>  
 >  >>  Your system is sharing an interrupt between the ciss and one of
 >  >> the USB controllers.  This means that every time there is an
 >  >> interrupt from disk I/O the USB drivers will also wake up to
 >  >> process it.  This could be causing things like moused and cups to
 >  >> run.  What devices are attached to uhci2?
 >  >>  
 >  > 
 >  > 
 >  
 > _______________________________________________
 > freebsd-bugs at freebsd.org mailing list
 > http://lists.freebsd.org/mailman/listinfo/freebsd-bugs
 > To unsubscribe, send any mail to
 > "freebsd-bugs-unsubscribe at freebsd.org"


More information about the freebsd-bugs mailing list