Batch file question - average size of file in directory

Wed Jan 3 11:10:28 PST 2007

On 2007-01-03 10:42, Kurt Buff <kurt.buff at gmail.com> wrote:
> On 1/2/07, James Long <list at museum.rain.com> wrote:
> <snip my problem description>
> >Hi, Kurt.
> >
> >Can I make some assumptions that simplify things?  No kinky filenames,
> >just [a-zA-Z0-9.].  My approach specifically doesn't like colons or
> >spaces, I bet.  Also, you say gzipped, so I'm assuming it's ONLY gzip,
> >no bzip2, etc.
>
> Right, no other compression types - just .gz.
>
> Here's a small snippet of the directory listing:
>
> -rw-r-----  1 kurt  kurt   108208 Dec 21 06:15 dummy-zKLQEWrDDOZh
> -rw-r-----  1 kurt  kurt    24989 Dec 28 17:29 dummy-zfzaEjlURTU1
> -rw-r-----  1 kurt  kurt    30596 Jan  2 19:37 stuff-0+-OvVrXcEoq.gz
> -rw-r-----  1 kurt  kurt     2055 Dec 22 20:25 stuff-0+19OXqwpEdH.gz
> -rw-r-----  1 kurt  kurt    13781 Dec 30 03:53 stuff-0+1bMFK2XvlQ.gz
> -rw-r-----  1 kurt  kurt    11485 Dec 20 04:40 stuff-0+5jriDIt0jc.gz
>
>> Here's a first draft [...]
>
> Hmmm....
>
> That's the same basic approach that Giogos took, to uncompress the
> file and count bytes with wc. I'm liking the 'zcat -l' construct, as
> it looks more flexible, but then I have to parse the output, probably
> with grep and cut.

Excellent.  I didn't know about the -l option of gzip(1) until today :)

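You can easily extract the uncompressed size, because it's always in
column 2 and consists only of digits.  Skip the header line and the
"(totals)" line that gzip prints when it lists more than one file;
otherwise the grand total gets counted as if it were one more file:

    gzip -l *.gz | awk '$2 ~ /^[0-9]+$/ && $NF != "(totals)" {print $2}'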

Then you can feed the resulting stream of uncompressed sizes to the awk
script I sent before :)
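
As a rough sketch of the whole thing in one place (not necessarily the
script sent earlier, just an illustration): the function name avg_size
is made up, and it assumes gzip's default, non-verbose -l layout, with
the uncompressed size in column 2 and a final "(totals)" line when
several files are listed.

```shell
#!/bin/sh
# avg_size (hypothetical name): read `gzip -l` output on stdin and
# print the number of files and their mean uncompressed size.
avg_size() {
    awk '
        # Keep only rows whose 2nd column is purely numeric (this drops
        # the header) and which are not the "(totals)" summary row.
        $2 ~ /^[0-9]+$/ && $NF != "(totals)" { sum += $2; n++ }
        END {
            # Print nothing if no data rows were seen (avoids dividing by zero).
            if (n > 0)
                printf "%d files, average %.1f bytes\n", n, sum / n
        }'
}

# Usage, from the directory holding the archives:
#   gzip -l *.gz | avg_size
```

Doing the summing in the same awk call that filters the rows avoids the
extra grep, and the check on the last field is what keeps the grand
total from being counted as a file.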