Bash script to find out the summary of user memory usage [not working]

Mon Dec 17 07:51:02 PST 2007

On 2007-12-17 06:00, Patrick Dung <patrick_dkt at yahoo.com.hk> wrote:
> I have correction with the script but still doesn't work:
>
> #!/usr/local/bin/bash
> for user in `ps -A -o user | sort | uniq | tail +2`
>  do
>         echo "user: $user"
>
>    ps aux -U $user | tail +2 | while read line
>    do
>
>     mem=`echo $line | awk {'print $4'}`
>         echo "mem: $mem"
>         TMPSUMMEM=`awk -v x=$mem -v y=$TMPSUMMEM 'BEGIN{printf
> "%.2f\n",x+y}'`
>         echo "summem: $TMPSUMMEM"
>    done
>         echo "finalsummem: $SUMMEM"
>         export SUMMEM=$TMPSUMMEM
>  done
>
>         echo "finalsummem: $SUMMEM"

There are *many* race conditions in that script.  For example, there's
no guarantee that the users listed in the "ps -A -o user" snapshot will
still have running processes by the time the loop calls ps(1) again for
each username.

The script is also a bit 'sub-optimal' because it calls ps(1) and parses
its output many times (at least once per user).  A much better way to
'design' something like this would be to run ps(1) only once, keep a
hash keyed by username, and add the memory value of each ps(1) output
line to the hash entry of its user.

I'm not even going to bother writing a script that uses a hash in
bash(1), because there are much better languages for working with
hashes, dictionaries or even simple arrays.

Here's, for example, a Python script which does what I described:

    #!/usr/bin/env python

    import os
    import re
    import sys

    try:
        pipe = os.popen('ps xauwww', 'r')
    except OSError:
        print("Cannot open pipe for ps(1) output")
        sys.exit(1)

    # Start with an empty dictionary.
    stats = {}

    # Regexp to match the ps(1) output header, so we can skip it.
    header = re.compile(r'USER\s')

    for line in pipe:
        if header.match(line):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue

        (username, mem) = (fields[0], float(fields[3]))
        # First time we see this user: start counting from zero.
        if username not in stats:
            stats[username] = 0.0
        stats[username] += mem

    pipe.close()

    # Print all the stats we have collected so far.
    if stats:
        total = 0.0
        print("%-15s %5s" % ('USERNAME', 'MEM%'))
        for k in stats:
            print("%-15s %5.2f" % (k, stats[k]))
            total += stats[k]
        # Finally print a grand total of all users.
        print("%-15s %5.2f" % ('TOTAL', total))

It's not the shortest Python script one could write to do what you
describe, but I've gone for readability rather than speed or
conciseness.

Running this script should produce output along these lines (the users
and numbers will of course differ on your system):

    $ ./foo.py
    USERNAME         MEM%
    _pflogd          0.10
    daemon           0.00
    bind             1.10
    _dhcp            0.10
    keramida        38.60
    smmsp            0.10
    root            10.10
    build            0.00
    TOTAL           50.10
    $
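
For comparison, here is a more compact sketch of the same single-pass
idea.  It is only an illustration, not a replacement for the script
above: it assumes Python 3, the same column layout (username in the
first column, %MEM in the fourth), and the names sum_mem_by_user and
SAMPLE are made up for this example.  The sample lines are invented
too, so the demo does not depend on what ps(1) prints on any particular
system.

```python
from collections import defaultdict

# Made-up sample in the "ps xauwww" layout: USER is column 1, %MEM column 4.
SAMPLE = """\
USER   PID %CPU %MEM   VSZ  RSS TT STAT STARTED    TIME COMMAND
root     1  0.0  0.1   780  400 ?? ILs   7:00AM 0:00.01 /sbin/init
alice  812  0.5  2.5 10240 5120 p0 Ss    7:05AM 0:01.20 -bash
alice  900  1.0  1.5 20480 3072 p0 S+    7:10AM 0:00.40 vi notes
"""


def sum_mem_by_user(lines):
    """Return a dict mapping username to summed %MEM (hypothetical helper)."""
    stats = defaultdict(float)      # unseen users start at 0.0
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue                # blank or truncated line
        try:
            mem = float(fields[3])
        except ValueError:
            continue                # header line: '%MEM' is not a number
        stats[fields[0]] += mem
    return dict(stats)


stats = sum_mem_by_user(SAMPLE.splitlines())
print("%-15s %5s" % ('USERNAME', 'MEM%'))
for user in sorted(stats):
    print("%-15s %5.2f" % (user, stats[user]))
print("%-15s %5.2f" % ('TOTAL', sum(stats.values())))
```

The function accepts any iterable of lines, so to run it against live
data you could pass it the pipe itself, e.g.
sum_mem_by_user(os.popen('ps xauwww')) after an "import os".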

PS: Yes, you could probably do the same in bash, with sed, awk and a bit
of superglue, but these days I prefer Perl and/or Python for anything
more involved than simple string substitution...