"tar -c|gzip" faster than "tar -cz"?!?

Vasil Dimov vd at FreeBSD.org
Wed Oct 11 04:15:10 PDT 2006


On Tue, Oct 10, 2006 at 07:27:53PM +0200, Oliver Fromme wrote:
> Hi,
> 
> While doing some performance tuning of a backup script
> I noticed that the -z option of our (bsd)tar behaves in
> a very suboptimal way.  It's not only a lot slower than
> using gzip separately, it also compresses worse.
> 
> I compared the following two commands (cwd=/):
> 
> A.  tar -cz --one-file-system -f- . | wc -c
> B.  tar -c --one-file-system -f- . | gzip | wc -c
> 
> In order to measure the time of the whole command pipes,
> I encapsulated them into subshell calls like this:
> /usr/bin/time sh -c 'tar ... | wc -c'
> 
> These are results for multiple invocations of A (tar -cz):
> 
>    7.30 real   7.15 user   0.09 sys
>    7.28 real   7.13 user   0.12 sys
>    7.29 real   7.14 user   0.09 sys
> 
> And these are the numbers for B (tar -c | gzip):
> 
>    5.54 real   5.37 user   0.15 sys
>    5.54 real   5.34 user   0.18 sys
>    5.55 real   5.40 user   0.12 sys
> 
> My first thought was that "tar -z" would use a better
> compression level (e.g. -9) vs. the gzip default of -6,
> which would explain why it is slower.  Therefore I
> expected the resulting backup to be smaller -- but just
> the opposite is the case.  Command A resulted in a
> compressed size of 25364480 bytes, while B was a bit
> smaller (25306059 bytes).
> 
> I'm surprised because I expected "tar -z" to be faster
> than a separate gzip process (at the same compression
> level), or at least as fast.  But it's 30% slower.
> 
> Is that a known problem?  Is someone working on it?
> 

You (wrongly) assumed that two processes would be slower than a single
one. It's exactly the opposite. While one process is constantly reading
disk contents, the other is constantly compressing. With a single process
you have to read data, compress, read data, compress, and so on, which is
suboptimal (see Mike's reply too).

It is not a problem in one program nor a feature in another. It's just
how things work.
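
To make the two variants concrete, here is a minimal sketch that builds
the same archive both ways and checks that they hold the same members.
The scratch directory, file names and sample contents are made up for
illustration (the original post archived / instead), and the explicit
"gzip -6" is only the gzip default mentioned above; whether "tar -z"
uses the same level is an assumption.

```sh
#!/bin/sh
# Hypothetical scratch data, only so the sketch is self-contained.
workdir=$(mktemp -d)
mkdir "$workdir/data"
i=0
while [ "$i" -lt 20 ]; do
    # Repetitive text so that gzip has something to compress.
    awk -v n="$i" 'BEGIN { for (j = 0; j < 2000; j++) print "sample line", j, "of file", n }' \
        > "$workdir/data/file$i.txt"
    i=$((i + 1))
done

# A: a single tar process reads the files and compresses them itself.
tar -czf "$workdir/a.tar.gz" -C "$workdir/data" .

# B: tar only reads; a separate gzip process compresses at the same time.
tar -cf - -C "$workdir/data" . | gzip -6 > "$workdir/b.tar.gz"

# Compressed sizes in bytes (tr strips the padding some wc versions add).
size_a=$(wc -c < "$workdir/a.tar.gz" | tr -d ' ')
size_b=$(wc -c < "$workdir/b.tar.gz" | tr -d ' ')

# Member lists of both archives; they should be identical.
list_a=$(gzip -dc "$workdir/a.tar.gz" | tar -tf - | sort)
list_b=$(gzip -dc "$workdir/b.tar.gz" | tar -tf - | sort)

echo "A (tar -cz):       $size_a bytes"
echo "B (tar -c | gzip): $size_b bytes"

rm -rf "$workdir"
```

To compare run times, each variant can be wrapped as in the original
post, e.g. /usr/bin/time sh -c 'tar ... | gzip | wc -c'. On a data set
this small the difference will be lost in the noise; it only becomes
visible with enough data to keep both processes busy.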

-- 
Vasil Dimov
gro.DSBeerF at dv
%
Look, that's why there's rules, understand?
So that you think before you break 'em.
    -- (Terry Pratchett, Thief of Time)