Tar output mode for installworld

Tim Kientzle kientzle at freebsd.org
Sun Jul 15 18:46:12 UTC 2007


Paul Schenkeveld wrote:
>>... read, extract, etc, a format I'm calling "ntree"
>>for now. ... lines have the following format:
>>   <filename> <key>=<value> <key>=<value> ...
>>Where key is one of:
>>  time ... gid,uid ... gname,uname ... mode ... content ...

> ...  I've been playing a lot with automating and customizing
>  the build/distribute/install process of FreeBSD ...

Yes, there are a lot of interesting games you can
play if you can build a description of a .tgz file and
then handily generate a .tgz file from that description.
I was pleased that bsdtar's already-existing "archive conversion"
feature made this so simple to implement.

> For the $CUSTOMIZATIONS to work it would be very nice to extend your
> proposal with the following features:
>   - Allow scattered lines describing attributes of the same file but
>     retain the order in which they appear so that $CUSTOMIZATIONS can
>     override attributes set by make installworld.

I already intend to support multiple lines for the same
file to allow attributes to be specified separately:
  file1 type=file mode=04666
  file1 uname=root gname=wheel
  file1 flags=noschg

It's trivial to ensure that later options override
earlier ones.  The tricky part is the word "scattered."
This either requires that libarchive read the entire
description file into memory up front (awkward, but not
particularly unreasonable) or that we have a separate
tool that can move those scattered lines so they occur together.
(My initial idea was that /usr/bin/sort would suffice,
but I don't think it will.  In particular, tar requires
that hardlinks follow the object being linked to, which
is not something that /usr/bin/sort is going to preserve.)
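A rough sketch of the "read everything into memory" option, in Python rather than libarchive's C, might look like the following. The function name merge_entries is made up for illustration, and the sketch assumes that filenames contain no whitespace and that every attribute has the key=value form shown above. Files keep the position of their first line instead of being sorted, so an entry that first appears after another one stays after it.

```python
def merge_entries(lines):
    """Merge scattered attribute lines; later values override earlier ones.

    Hypothetical helper, not part of libarchive or bsdtar.
    """
    entries = {}  # dicts keep insertion order, so first-appearance order survives
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        name, pairs = fields[0], fields[1:]
        attrs = entries.setdefault(name, {})
        for pair in pairs:
            key, sep, value = pair.partition("=")
            if not sep:
                raise ValueError(f"expected key=value, got {pair!r}")
            attrs[key] = value  # a later line overrides an earlier one
    return entries


spec = """\
file1 type=file mode=04666
file2 type=file mode=0644
file1 uname=root gname=wheel
file1 mode=0755
"""

merged = merge_entries(spec.splitlines())
for name, attrs in merged.items():
    print(name, " ".join(f"{k}={v}" for k, v in attrs.items()))
```

Here the last file1 line overrides the mode given on the first one, while file1 still comes out ahead of file2.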

>   - Implement something like:    <filename> remove

Hmmm...  Interesting idea.  It would also be
interesting to extend the tar format with a "whiteout"
entry that erases an entry already on disk.  Hmmm....
does this raise any security issues?  I'll have to
think about that one.

Question:  Should
    file1 type=file
    file1 remove

only remove the file from this archive (i.e., not
create file1), or should it also look on disk and
remove file1 if it's there?  Obviously, the latter
can only be carried through a .tgz installation file
if the tar format supports whiteouts.

>   - Implement something like:
>       <filename> move=<newpath>
>     to allow $CUSTOMIZATIONS to move things around.
>     A special case here should be observed:
>       /bin/foo ...
>       ...
>       /bin/foo move=/bin/foo.orig
>       /bin/foo ...

Implementable if libarchive reads the entire description into
memory up front, but otherwise very tricky.  It looks like the
libarchive support will have to read the entire description
into memory so it can preprocess it before returning
complete entries.
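To make the preprocessing concrete, here is a minimal Python sketch of an in-memory pass that handles the suggested "remove" and "move=" directives in file order. It is only one possible reading: "remove" is taken to mean "drop the entry from this archive" (no whiteout), a move of a name that has not been described yet is silently ignored, and the name preprocess is hypothetical.

```python
def preprocess(lines):
    """Apply attribute, 'remove' and 'move=' directives in file order.

    Hypothetical sketch; 'remove' only affects the description held in
    memory and never touches anything on disk.
    """
    entries = {}  # insertion-ordered mapping: path -> attributes
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue
        name, words = fields[0], fields[1:]
        for word in words:
            key, sep, value = word.partition("=")
            if key == "remove" and not sep:
                entries.pop(name, None)           # forget everything said so far
            elif key == "move" and sep:
                if name in entries:
                    entries[value] = entries.pop(name)  # rename the pending entry
            elif sep:
                entries.setdefault(name, {})[key] = value
            else:
                raise ValueError(f"unrecognized directive {word!r}")
    return entries


spec = """\
/bin/foo type=file mode=0555
/bin/bar type=file mode=0555
/bin/foo move=/bin/foo.orig
/bin/foo type=file mode=0755
/bin/bar remove
"""

result = preprocess(spec.splitlines())
for path, attrs in result.items():
    print(path, attrs)
```

Because the directives are applied strictly in order, the special case above works out: the first /bin/foo ends up as /bin/foo.orig, and the lines after the move describe a fresh /bin/foo.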

> Having a file describing everything that gets installed would also benefit
> later upgrades to a system.

One of my questions:  Does my proposed format suffice for these
other purposes?  If not, what other features would be required?
Is it worth trying to design a single format that handles these
various cases?

>  In my experiments I've implemented a manifest
> that gets shipped with the tarball.tgz ... includes md5 and/or sha256 ...

There is some real value to having md5 and/or sha256 checks in generic
tar archives as well.  I have a couple of ideas for supporting
this using pax extensions, but it's tricky to implement robustly,
so it's not going to happen for a little while.  (In particular, the
checksum in the tar archive should cover all of the headers,
and you want to avoid a separate scan of the file to compute a
checksum before archiving it.)
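The "no separate scan" part can be illustrated with a small Python sketch that updates a SHA-256 digest while the data is being copied. It says nothing about where a pax extension would store the result or how the headers would be covered; copy_with_digest is a made-up name, and the in-memory streams stand in for a real file and archive.

```python
import hashlib
import io


def copy_with_digest(src, dst, chunk_size=64 * 1024):
    """Copy src to dst, hashing the data in the same pass.

    Hypothetical helper: returns the hex SHA-256 of everything copied.
    """
    digest = hashlib.sha256()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)  # hash the chunk...
        dst.write(chunk)      # ...and write it out, with no second read of src
    return digest.hexdigest()


payload = b"hello, world\n" * 1000
src = io.BytesIO(payload)   # stands in for the file being archived
dst = io.BytesIO()          # stands in for the archive body
checksum = copy_with_digest(src, dst)
print(checksum)
```

One consequence of this approach is that the digest is only known once the data has been written, so it cannot simply be placed in a header that precedes the data unless the data is buffered first.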

Tim Kientzle


More information about the freebsd-hackers mailing list