/usr/bin/script eating 100% cpu with portupgrade and xargs

Jeremy Chadwick freebsd at jdc.parodius.com
Sun Sep 18 07:46:05 UTC 2011


On Sun, Sep 18, 2011 at 08:47:13AM +0200, Ronald Klop wrote:
> On Sun, 18 Sep 2011 07:39:01 +0200, Jeremy Chadwick
> <freebsd at jdc.parodius.com> wrote:
> 
> >On Sun, Sep 18, 2011 at 12:54:13AM -0400, Jason Hellenthal wrote:
> >>On Sun, Sep 18, 2011 at 01:49:15AM +0200, Ronald Klop wrote:
> >>> Hi,
> >>>
> >>> I'm running portupgrade in screen to update all the ports for
> >>> 9-BETA2/9-CURRENT on amd64. While doing this script eats 100% cpu.
> >>> Because portupgrade -fa crashed I'm running this command to update the
> >>> remaining non-updates ports.
> >>> find /var/db/pkg -name +DESC -mtime +2 |cut -d / -f 5 | xargs
> >>time nice -n
> >>> 20 portupgrade -f
> >>>
> >>> The output of truss -p `pgrep script` is this:
> >>> clock_gettime(13,{1316301104.000000000 })        = 0 (0x0)
> >>> select(5,{0 4},0x0,0x0,{30.000000 })             = 1 (0x1)
> >>> read(0,0x7fffffffcdf0,1024)                      = 0 (0x0)
> >>> write(4,0x7fffffffcdf0,0)                        = 0 (0x0)
> >>> clock_gettime(13,{1316301104.000000000 })        = 0 (0x0)
> >>> select(5,{0 4},0x0,0x0,{30.000000 })             = 1 (0x1)
> >>> read(0,0x7fffffffcdf0,1024)                      = 0 (0x0)
> >>> write(4,0x7fffffffcdf0,0)                        = 0 (0x0)
> >>> clock_gettime(13,{1316301104.000000000 })        = 0 (0x0)
> >>> select(5,{0 4},0x0,0x0,{30.000000 })             = 1 (0x1)
> >>> read(0,0x7fffffffcdf0,1024)                      = 0 (0x0)
> >>> write(4,0x7fffffffcdf0,0)                        = 0 (0x0)
> >>> clock_gettime(13,{1316301104.000000000 })        = 0 (0x0)
> >>> select(5,{0 4},0x0,0x0,{30.000000 })             = 1 (0x1)
> >>> read(0,0x7fffffffcdf0,1024)                      = 0 (0x0)
> >>> write(4,0x7fffffffcdf0,0)                        = 0 (0x0)
> >>>
> >>> So it is really fast in reading and writing 0 bytes most of the time.
> >>>
> >>> I also found
> >>http://web.archiveorange.com/archive/v/6ETvLvjo60Gj9geAUAb6
> >>> and I think I am better of by rewriting my command so stdin/stdout is
> >>> still the terminal. Although the link is a couple of years old.
> >>>
> >>> Is this known? Can somebody explain me why my xargs command is
> >>not working
> >>> well?
> >>>
> >>
> >>Are you absolutely sure that its script(1) causing this ? 100% CPU usage
> >>has been a known side effect of screen(1) for quite some time. Rebuild
> >>it and try again.
> >
> >Jason's referring to this, I believe:
> >http://www.freebsd.org/cgi/cvsweb.cgi/ports/sysutils/screen/Makefile#rev1.55
> >
> >To clarify the what the commit message means: it does not mean "when the
> >package is installed the installation takes up 100% CPU".  It means
> >"once the package is installed and screen is used, screen takes up 100%
> >CPU".  I know because I've seen this behaviour in the past (one of the
> >many, many reasons I build ports from source).
> >
> >However:
> >http://www.freebsd.org/cgi/cvsweb.cgi/ports/sysutils/screen/Makefile#rev1.78
> >
> >So: If a binary package is being installed through your above
> >portupgrade command, and you're seeing this problem, then it sounds to
> >me like commit revision 1.78 is a regression and NO_PACKAGE should be
> >put back into place + packages removed from all mirrors.
> >
> >There are many reasons to not use GNU screen at all, or if you must have
> >something like it, use tmux.  I recently had to provide an analysis of
> >how GNU screen destroys one's terminal[1]; so if the above problem turns
> >out to be caused by GNU screen as well, I'll just add it to my
> >ever-growing list of reasons the software should be nuked from orbit.
> >
> >Otherwise, if this turns out to be a problem with portupgrade (which you
> >found some evidence supporting such), then the solution is simple: stop
> >using portupgrade, use portmaster (if it lacks things you need ask Doug
> >Barton, he's incredibly receptive to adding new features/fixing things).
> >Two databases that aren't compatible, ruby shims, and other crap = not
> >worth it.  Think the database ordeal is long over with/fixed/whatever?
> >It isn't[2].
> >
> >[1]: http://lists.freebsd.org/pipermail/freebsd-stable/2011-June/063052.html
> >[2]: http://www.dslreports.com/forum/r26304856-FreeBSD-defining-portmaster-alias
> >
> 
> I have a repeatable test. Run top in a window and this command in another.
> $ echo test | script /tmp/script-test sleep 1000
> Script started, output file is /tmp/script-test
> test
> 
>   PID USERNAME       THR PRI NICE   SIZE    RES STATE   C   TIME
> CPU COMMAND
> 29656 ronald           1 103    0 12324K  1244K CPU4    4   1:03
> 100.00% script
> 
> So it has nothing to do with portupgrade or screen. The output of
> truss -p29656 is the same as posted previously.

Good deal.

I can reproduce this problem too, on RELENG_8 dated 2011/09/16.  I
haven't tried older FreeBSD releases.

$ echo test | script /tmp/script-test sleep 1000
Script started, output file is /tmp/script-test
test

  PID USERNAME   THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
33803 jdc          1 114    0  7020K  1236K CPU1    1   0:22 79.98% script /tmp/script-test sleep 1000

Relevant fstat to examine fds:

# fstat -p 33803
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
jdc      script     33803 root /             2 drwxr-xr-x     512  r
jdc      script     33803   wd /home         6 ?---------  18446744071573739670  r
jdc      script     33803 text /usr     683285 -r-xr-xr-x   10152  r
jdc      script     33803    0* pipe ffffff0007bc62d8 <-> ffffff0007bc6430      0 rw
jdc      script     33803    1 /dev        128 crw--w----   pts/1 rw
jdc      script     33803    2 /dev        128 crw--w----   pts/1 rw
jdc      script     33803    3 /tmp          7 -rw-------      61  w
jdc      script     33803    4* pseudo-terminal master      pts/2 rw

ktrace output, since truss never behaves well on FreeBSD (I gave up
trying to use it to accomplish anything long ago):

# ktrace -t+ -p 33803
# ktrace -C
# ls -l ktrace.out
-rw-------  1 root   wheel     155125652 Sep 18 00:41 ktrace.out
# kdump
 33803 script   CALL  read(0,0x7fffffffe120,0x400)
 33803 script   GIO   fd 0 read 0 bytes
       ""
 33803 script   RET   read 0
 33803 script   CALL  write(0x4,0x7fffffffe120,0)
 33803 script   GIO   fd 4 wrote 0 bytes
       ""
 33803 script   RET   write 0
 33803 script   CALL  clock_gettime(0xd,0x7fffffffe050)
 33803 script   RET   clock_gettime 0
 33803 script   CALL  select(0x5,0x7fffffffe070,0,0,0x7fffffffe0f0)
 33803 script   RET   select 1
 33803 script   CALL  read(0,0x7fffffffe120,0x400)
 33803 script   GIO   fd 0 read 0 bytes
       ""
 33803 script   RET   read 0
 33803 script   CALL  write(0x4,0x7fffffffe120,0)
 33803 script   GIO   fd 4 wrote 0 bytes
       ""
 33803 script   RET   write 0
 33803 script   CALL  clock_gettime(0xd,0x7fffffffe050)
 33803 script   RET   clock_gettime 0
 33803 script   CALL  select(0x5,0x7fffffffe070,0,0,0x7fffffffe0f0)
 33803 script   RET   select 1
 33803 script   CALL  read(0,0x7fffffffe120,0x400)
 33803 script   GIO   fd 0 read 0 bytes
       ""
 33803 script   RET   read 0
 33803 script   CALL  write(0x4,0x7fffffffe120,0)
 33803 script   GIO   fd 4 wrote 0 bytes
       ""
 33803 script   RET   write 0
 33803 script   CALL  clock_gettime(0xd,0x7fffffffe050)
 33803 script   RET   clock_gettime 0
 33803 script   CALL  select(0x5,0x7fffffffe070,0,0,0x7fffffffe0f0)
 33803 script   RET   select 1
 33803 script   CALL  read(0,0x7fffffffe120,0x400)
 33803 script   GIO   fd 0 read 0 bytes
       ""
 33803 script   RET   read 0
 33803 script   CALL  write(0x4,0x7fffffffe120,0)
 33803 script   GIO   fd 4 wrote 0 bytes

CC'ing Jilles, as he might have some ideas.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |



More information about the freebsd-stable mailing list