"tar -c|gzip" faster than "tar -cz"?!?

Oliver Fromme olli at lurza.secnetix.de
Wed Oct 11 05:38:07 PDT 2006


Vasil Dimov wrote:
 > You (wrongly) assumed that two processes will do slower than a single
 > one.

That assumption should be true, in general, at least on a
single-CPU machine.  With two processes, there is additional
overhead for data copying through the pipe.

 > It's exactly the opposite. While the one is constantly reading disk
 > contents the other is constantly compressing. With one process you have
 > to read data, compress, read data, compress and so on which is
 > suboptimal (see Mike's reply too).

Mike's reply isn't applicable, because no blocking on I/O
occurs.  All data is in RAM.  The amount of reads from disk
(or rather from cache) and writes to disk (or rather to
the wc command) is exactly the same in both cases.
In fact, the case with separate gzip involves more I/O
(namely the pipe, which more than doubles the I/O overhead).
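
For anyone who wants to repeat the comparison, here is a rough
sketch of the two variants, both writing to wc as described above.
The sample directory, the file contents and the function names
one_process and two_processes are made up for illustration; they
are not the data or commands from the original measurement.

```shell
#!/bin/sh
# Hypothetical sample data: a temporary directory of compressible text.
dir=$(mktemp -d)
i=0
while [ "$i" -lt 50 ]; do
    seq 1 2000 > "$dir/file$i.txt"
    i=$((i + 1))
done

# Variant 1: tar compresses internally, no separate gzip process.
one_process() {
    tar -czf - -C "$1" . | wc -c | tr -d ' '
}

# Variant 2: tar writes an uncompressed stream and a separate
# gzip process compresses it, so the data crosses one more pipe.
two_processes() {
    tar -cf - -C "$1" . | gzip | wc -c | tr -d ' '
}

# Run each variant and report the size of the compressed stream.
n1=$(one_process "$dir")
n2=$(two_processes "$dir")
echo "tar -cz:       $n1 bytes"
echo "tar -c | gzip: $n2 bytes"

rm -rf "$dir"
```

To compare run times rather than sizes, keep the directory around
and put "time" in front of each call, in a shell where time is a
keyword that can time a function (bash, for example).  Run each
variant more than once so that all data really comes from the
cache, and use far more data than this sample.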

As it has since turned out, the code in libarchive + libz
seems to be less efficient than the code in gzip.  It's a
30% difference.  It has nothing to do with the number of
processes (whether there's one or two).  If tar used the
gzip code, it would be just as fast.

Best regards
   Oliver

PS:  Please respect Reply-to.  I'm reading the list and
don't need to receive another copy.

-- 
Oliver Fromme,  secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

 > Can the denizens of this group enlighten me about what the
 > advantages of Python are, versus Perl ?
"python" is more likely to pass unharmed through your spelling
checker than "perl".
        -- An unknown poster and Fredrik Lundh


More information about the freebsd-hackers mailing list