cvs commit: src/usr.bin/tar Makefile bsdtar.1 bsdtar.c bsdtar.h
bsdtar_platform.h matching.c read.c util.c write.c
Tim Kientzle
tim at kientzle.com
Tue Apr 6 11:15:23 PDT 2004
Ruslan Ermilov wrote:
> On Mon, Apr 05, 2004 at 02:32:18PM -0700, Tim Kientzle wrote:
>
>>kientzle 2004/04/05 14:32:18 PDT
>>
>> FreeBSD src repository
>>
>> Added files:
>> usr.bin/tar Makefile bsdtar.1 bsdtar.c bsdtar.h
>> bsdtar_platform.h matching.c read.c
>> util.c write.c
>> Log:
>> Initial commit for bsdtar.
>>
>
> Awesome! Are there some benchmarking results available?
I haven't focused very closely on performance yet, to be honest, though
the internal architecture is pretty clean (minimal data copying;
reuse of internal buffers to avoid heap thrashing).
I did some quick tests early on and extraction performance
was roughly comparable to gnutar (within about 5-10%). That should
improve somewhat as I continue to work on it. However, in general,
I expect it to be a little bit slower because the compression
isn't handled in a separate process (thus there's less overlapping
of I/O and computation).
But, there are a lot of nice new features:
* Fully automatic format/compression detection.
In particular, the following commands all work:
    bsdtar -xf file.tgz
    bsdtar -xf file.tbz
    bsdtar -xf file.cpio
or even
    fetch -o - http://...../file.tgz | bsdtar -xf -
GNU tar can't do any of these; 'star' fails the last
one. To be fair, "Heirloom tar" does support all of these.
* Ability to interpolate an archive. The following
combines the contents of "foo1.tgz" and "foo2.cpio"
into a single archive called "out.tbz":
    bsdtar -cjf out.tbz @foo1.tgz @foo2.cpio
Yes, you can mix interpolations and regular files on
the command line. You can even interpolate from stdin:
    bsdtar -cjf - -F pax @-
converts an archive read on stdin into a pax-format,
bzip2-compressed archive on stdout. Once I get mtree
read support, you'll be able to convert an mtree file
into a shell script, for example:
    bsdtar -cf tree.sh -F shar @tree.mtree
* Compliance with SUSv2. SUSv2 (POSIX.1-1997 ?) was
the last official spec for tar. GNU tar does not
comply with the file format specified there, nor does
it correctly implement the command-line options specified
there. By default, bsdtar will create standard ustar
archives unless it finds a file attribute that is not
supported by ustar (such as a very long filename or ACL),
in which case it will use SUSv3 (POSIX.1-2001) extensions
to carry the additional data. There are command-line options
to force straight ustar format or permit SUSv3 ("pax")
extensions even when not absolutely required. (The default
format won't use SUSv3 extensions just to store atime/ctime
or sub-second timestamps; specifying "pax" format will.)
* Support for SUSv3 extensions. The "pax" format extensions
eliminate essentially all of the historic limitations of
tar in a way that is easily extensible and compatible with
standard-compliant "pax" implementations on other platforms
(as well as some modern tar implementations, notably Joerg
Schilling's "star").
* More complete archiving. With the "pax" format, bsdtar will
archive ino/dev/nlink, sub-second resolution mtime/ctime/atime,
ACLs, file flags, etc. Not all of this can currently be
restored (ino/dev/nlink/ctime are currently ignored on extract),
but it's all stored in the archive.
* Broad format support. bsdtar reads the usual bevy of tar formats,
and some cpio archives (only the odc variant at the moment).
It writes standard tar formats, cpio, and shar. The
underlying libarchive library is extensible and I have plans
for reading mtree files, reading/writing more cpio
formats, reading ZIP archives, etc.
* Cleanly factored. The archive format support is all in a separate
library. It should be fairly routine to build "cpio" or "pax"
command-line interfaces to the same library or use the library for
"pkg_install" or "pkg_create." For comparison, right now "bsdtar"
is ~2,000 lines of C, while "libarchive" is closer to 10,000 lines of C.
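
To illustrate the automatic format/compression detection from the first
item above, here is a minimal sketch in Python. It is not libarchive's
actual code; it only shows the general idea of inspecting well-known
magic bytes and peeling off compression layers until an archive format
is found. The names sniff, detect_layers and make_tar are hypothetical
helpers invented for this example.

```python
import bz2
import gzip
import io
import tarfile


def sniff(data: bytes) -> str:
    """Guess the outermost format from magic bytes (hypothetical helper)."""
    if data[:2] == b"\x1f\x8b":        # gzip magic number
        return "gzip"
    if data[:3] == b"BZh":             # bzip2 stream header
        return "bzip2"
    if data[:6] == b"070707":          # cpio "odc" (portable ASCII) header
        return "cpio-odc"
    if data[257:262] == b"ustar":      # tar magic field at offset 257
        return "tar"
    return "unknown"


def detect_layers(data: bytes) -> list:
    """Strip compression layers one at a time, recording each format seen."""
    layers = []
    kind = sniff(data)
    while kind in ("gzip", "bzip2"):
        layers.append(kind)
        data = gzip.decompress(data) if kind == "gzip" else bz2.decompress(data)
        kind = sniff(data)
    layers.append(kind)
    return layers


def make_tar() -> bytes:
    """Build a tiny in-memory ustar archive to exercise the sniffer."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
        payload = b"hi\n"
        info = tarfile.TarInfo("hello.txt")
        info.size = len(payload)
        tf.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


if __name__ == "__main__":
    print(detect_layers(gzip.compress(make_tar())))
    print(detect_layers(bz2.compress(make_tar())))
```

Because the decision depends only on the bytes themselves, the same
approach works on a pipe, where no file name or extension is available.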
There is some performance work to be done; I need to build
a uid/gid/uname/gname cache, for example. Part of my recent rewrite
of the ACL support was to get to the point that there was one
place where all such lookups were handled, regardless of whether
it's a file owner or an ACL that needs the information.
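
As a rough sketch of what such a uid/gid/uname/gname cache might look
like (in Python rather than C, and not the planned bsdtar code), the
class below memoizes id-to-name lookups so that each id reaches the
system database at most once, whether the request comes from a file
owner or an ACL entry. NameCache, uname_lookup and gname_lookup are
hypothetical names; failed lookups are cached as None, which is an
assumption of this example.

```python
import grp
import pwd


class NameCache:
    """Memoize id -> name lookups (hypothetical illustration)."""

    def __init__(self, lookup):
        self._lookup = lookup   # function mapping an id to a name
        self._names = {}        # id -> name, or None if the id is unknown
        self.misses = 0         # number of times the real lookup ran

    def name(self, ident):
        if ident not in self._names:
            self.misses += 1
            try:
                self._names[ident] = self._lookup(ident)
            except KeyError:
                # Cache negative results too, so unknown ids are not retried.
                self._names[ident] = None
        return self._names[ident]


def uname_lookup(uid):
    """Resolve a uid via the password database; raises KeyError if absent."""
    return pwd.getpwuid(uid).pw_name


def gname_lookup(gid):
    """Resolve a gid via the group database; raises KeyError if absent."""
    return grp.getgrgid(gid).gr_name


if __name__ == "__main__":
    users = NameCache(uname_lookup)
    groups = NameCache(gname_lookup)
    # Repeated queries for the same id are served from the cache.
    for _ in range(3):
        users.name(0)
        groups.name(0)
    print(users.misses, groups.misses)
```

Routing every owner and ACL lookup through one object like this is what
makes it possible to add the cache in a single place.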
There are still a few bugs to iron out and a couple of features that
are a bit incomplete, but it's getting better quickly. My hope
is that a few adventurous souls will start using it and giving
me feedback so that I can grow it into the system tar
that FreeBSD deserves.
Tim