Tim Kientzle <tim at> wrote:
> Brian F. Feldman wrote:
> > Tim Kientzle <tim at> wrote:
> > 
> > Oh, I was only implementing it inside the libarchive default routines 
> > because it was easy.  I also think that if this is done to speed up tar, it 
> > should be done in tar and not libarchive.  What are you using to benchmark?  
> > I'm interested in seeing what happens with a worker thread/with a larger 
> > decompression buffer/etc.
> I think I was just using:
> bsdtar -cvjf /dev/null /usr
> for compression benchmarking.  For decompression benchmarking,
> something like
> bsdtar -tf archive.tbz >/dev/null
> should suffice.

Okay, I tested it out and hammered it into shape so it could actually 
perform concurrently (see the updated patch for how I finally implemented 
that prototype).

I saw only the tiniest speed difference in both tar tf and tar xfv -- it's 
hard to measure any difference whatsoever.  With full concurrency it might 
even be a little slower, and bsdtar is ALWAYS faster than the 
(multi-process) tar tfy/xfy!  So don't bother doing anything else to compete 
in that respect if that's all you want to beat.

One thing I did notice was the huge number of calls going back and 
forth between the tar reader, the bzip2 reader, and the file reader; it 
should be possible to get more speed out of bsdtar by actually pulling in 
an entire block at a time for whatever decompressor is being used.  For 
example, bzip2 blocks can be 900KB while the current buffer size is 10KB, 
and for anything at all to be decompressed the entire block has to have been 
run through bzip2, building up its internal state to 3700KB in the process.  
If there's enough speed benefit, it's silly to save a little bit of space by 
using a very small "file read" buffer.  For S_ISREG() files, maybe use no 
buffer at all instead of 10KB by using mmap(2)...
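
To make the mmap(2) idea concrete, here is a minimal sketch, not taken from 
the patch or from libarchive.  The names feed_file(), chunk_fn and add_len() 
are hypothetical.  It assumes the consumer (standing in for the decompressor) 
accepts arbitrary-sized chunks: a regular file is mapped and handed over in 
one call, and anything else falls back to a read(2) loop with a small buffer.  
With a 10KB buffer, a single 900KB bzip2 block would take roughly 90 such 
calls.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical consumer callback: receives one chunk of input. */
typedef void (*chunk_fn)(const void *data, size_t len, void *ctx);

/* Example consumer: adds each chunk length to the size_t that ctx points to. */
void add_len(const void *data, size_t len, void *ctx)
{
    (void)data;
    *(size_t *)ctx += len;
}

/*
 * Feed the contents of `path` to `consume`.  A non-empty regular file is
 * mapped and delivered in a single call; everything else goes through a
 * read(2) loop using `buf_size` bytes at a time.  Returns the number of
 * consume() calls made, or -1 on error.
 */
long feed_file(const char *path, size_t buf_size, chunk_fn consume, void *ctx)
{
    char buf[65536];
    struct stat st;
    ssize_t n;
    long calls = 0;
    int fd;

    assert(buf_size > 0 && buf_size <= sizeof(buf));

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    if (S_ISREG(st.st_mode) && st.st_size > 0) {
        size_t len = (size_t)st.st_size;
        void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            consume(p, len, ctx);   /* whole file, one call, no copy */
            munmap(p, len);
            close(fd);
            return 1;
        }
        /* mmap failed: fall through to the read(2) loop below. */
    }

    while ((n = read(fd, buf, buf_size)) > 0) {
        consume(buf, (size_t)n, ctx);
        calls++;
    }
    close(fd);
    return n < 0 ? -1 : calls;
}
```

Whether this actually helps would have to be measured with the same bsdtar 
runs as above; mapping a very large archive in one piece may also be 
undesirable on 32-bit systems, so a real version would probably map a window 
at a time.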

It's funny how you say the performance might be bad compared to gnutar even 
though in decompression you actually save a fair bit of CPU!

