Second "RFC" on pkg-data idea for ports

Garance A Drosihn drosih at rpi.edu
Sun Apr 18 19:07:29 PDT 2004


At 1:40 PM -0500 4/17/04, Mark Linimon wrote:
>
>You've mentioned (in a later message than this one) that you
>have some ideas about future directions that could spring from
>this work, but that they are not yet fully-formed enough to be
>written down.  While that's fair enough, until that's done, it's
>really hard to weigh the tradeoffs involved in doing all this
>(IMHO) extensive work.  But if that's the case, then what you're
>trying to address is not just the inodes problem.

This is correct.  I really hope to address much more than the
simple inodes problem, but those ideas need more work before I
would want to spring them on anyone.

>Lacking that, what we have is a proposal to address the inodes
>problem.

Basically, yes.  A little more than that, but not much.  I also
wanted to add copyright information per-port, and have a way
that users and developers could refer to "port-version X.Z of blah",
where X.Z is the version of the entire collection of files for
freebsd-port "blah".  I wanted to be able to add that information,
without increasing the size of the ports-collection by too much.

And in fact, it looks like I can add that, and decrease the size
by about 25-30%.  That's a reduction in filespace, not just inodes.
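
To make the distinction between inodes and filespace concrete, here
is a rough Python sketch of how the two could be measured on a
checked-out tree before and after a transformation.  It is only an
illustration: the names tree_stats and percent_reduction are made up
for this example and are not part of the pkg-data work, and it counts
file bytes rather than allocated disk blocks.

```python
import os


def tree_stats(root):
    """Return (files, dirs, file_bytes) for everything below root.

    Each file and each directory costs one inode, so files + dirs
    approximates the inode count.  Symlinks are not followed, and
    hard links are counted once per name.
    """
    files = 0
    dirs = 0
    file_bytes = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        for name in filenames:
            files += 1
            file_bytes += os.lstat(os.path.join(dirpath, name)).st_size
    return files, dirs, file_bytes


def percent_reduction(before, after):
    """Percentage by which 'after' is smaller than 'before'."""
    return 100.0 * (before - after) / before
```

Running tree_stats on the old and the new tree and feeding each pair
of numbers to percent_reduction gives one percentage for inodes and a
separate one for bytes.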

>1. (easy) If the distinfo lines were moved into the Makefiles,
>    that would result in a savings of 9568 files out of 10149
>    ports (60075 files), for about 16%.  (Note: I'm using the
>    numbers from an old tree, but the percentage has probably
>    not changed significantly).
>
>    (Disclaimer: although I personally am not really fond of
>    this solution due to the repo-churn it would create, I
>    know that other people are pushing for this to be done).

I am not fond of that either, as I feel it is a step in the
wrong direction.  Well, I guess it depends on how it is done.
If it is done as more MAKE variables, then I do think it's a
step in the wrong direction.  Other people have other opinions,
but that is my opinion and so far no one has convinced me
otherwise.

One reason I'm promoting my initial pkg-data project now is to
address that same problem, while taking things in (what I
consider) a better direction.

Another reason is that I notice people are considering the
idea of restructuring all of ports, to have a three-level
setup instead of a two-level setup.  If that is done, it will
create a lot of churn.   If the pkg-data ideas were done, that
would also create a lot of churn -- so I am thinking it would be
easier on repositories to get both big-churn projects in at the
same time.
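
Going back to the figures quoted under point 1, the "about 16%" can
be checked with a few lines of Python.  The three numbers are the
ones from the quoted message (taken from an old tree); the function
name percent_saved is just for this example.

```python
def percent_saved(files_removed, files_total):
    """Share of the tree's files that would disappear, in percent."""
    return 100.0 * files_removed / files_total


DISTINFO_FILES = 9568   # distinfo files that would fold into Makefiles
TOTAL_FILES = 60075     # all files in the (old) ports tree
TOTAL_PORTS = 10149     # ports in that tree

# Prints 15.9, which rounds to the "about 16%" quoted above.
print(round(percent_saved(DISTINFO_FILES, TOTAL_FILES), 1))
```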

>2. (intermediate) Let's change the way we think about
>    patchfiles.

What you describe would be a huge project.  It is not doable
by one or two people.  You need every ports-developer to sign
up for that extra work.

However, we could perhaps do something where patch-files are
separate from all the other information-files of a port.  That
way people would download freebsd-patches the same way they
download the original tarballs.  Which is to say, you would
only download the patches for things you actually INSTALL,
instead of the patches for all 10,000 ports that you "might"
want to install.  In fact, on my sparc64 machine, the files I
have to download include patches that I *cannot* install.

Doing something along those lines might be an interesting idea.
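
As a very rough sketch of the "only fetch patches for what you
install" idea, assume there were some index mapping each port origin
to its patch files.  No such index exists today; patch_index,
patches_needed and the sample entries below are all hypothetical.

```python
def patches_needed(installed, patch_index):
    """Return {origin: [patch names]} for installed ports only.

    installed   -- iterable of port origins such as "www/bar"
    patch_index -- hypothetical mapping of origin -> list of patch files
    Ports without any patches are simply left out of the result.
    """
    return {
        origin: list(patch_index[origin])
        for origin in sorted(installed)
        if origin in patch_index
    }


# Made-up sample data, for illustration only.
patch_index = {
    "devel/foo": ["patch-aa"],
    "www/bar": ["patch-ab", "patch-ac"],
    "x11/baz": ["patch-ad"],
}
wanted = patches_needed({"www/bar", "misc/nopatch"}, patch_index)
print(wanted)
```

With this sample data only the two patches for www/bar would be
fetched; the patches for devel/foo and x11/baz stay on the server.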

>3. (advanced) Right now our default assumption is that to
>    install any ports,  you have to install the entire ports
>    collection.  This is true whether you install ports via
>    downloading and unzipping the tarball from our main site,
>    or use cvsup.  Perhaps it's time to reevaluate this
>    assumption.

I would definitely like to see some way to eliminate this need.
At the moment I have no good ideas on how to do that...  It is
a little frustrating that cvsup has the idea of "refuse" lists,
but the first thing we tell everyone about the ports collection
is "you must download the entire collection, every time".

I only have 100 ports installed, but to keep them up to date it
seems I need to download a 235-meg collection of files on how
to build ports.
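
For illustration, a refuse list for unused categories could be
generated along the following lines.  This is a sketch under
assumptions: it assumes the refuse file takes one pattern per line
relative to the collection prefix (e.g. "ports/chinese"), which
should be checked against cvsup(1), and the function name and sample
category list are invented here.  It also ignores dependencies, so a
port you build may still need something from a refused category.

```python
def refuse_lines(all_categories, installed_origins, prefix="ports"):
    """Return refuse patterns for categories with no installed port.

    all_categories    -- iterable of category names, e.g. "devel"
    installed_origins -- iterable of origins, e.g. "devel/foo"
    """
    used = {origin.split("/", 1)[0] for origin in installed_origins}
    return [
        "%s/%s" % (prefix, cat)
        for cat in sorted(all_categories)
        if cat not in used
    ]


# Made-up sample input, for illustration only.
lines = refuse_lines(
    ["chinese", "devel", "german", "www"],
    ["devel/foo", "www/bar"],
)
print("\n".join(lines))
```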

>3b. ... My first attempt ..., led me to the conclusion that the
>     gain from partitioning out the "easy cases" was on the order
>     of 9% of the inodes.  I haven't pursued it further, because
>     9% didn't sound super-attractive to me; but ...

Well, fwiw note that initial tests of the pkg-data transformation
indicate a 58% reduction in inodes.  But that is preliminary, as
we can't yet say that what I have now will be the final form.

>(As an example, my other conclusion from that shell-script run
>was "everything depends on devel, and devel depends on everything
>else".  Since devel has 1184 ports in it, it's difficult to attack
>the overall problem without attacking devel ...)
>
>I honestly don't think anyone in the FreeBSD project really has
>a handle on what that dependency graph looks like.  And this is
>where I think your desire to have someone work on the inodes
>problem, who doesn't have an intricate knowledge of coding to
>the existing infrastructure, could be invaluable.

These are interesting things to consider.  We'll have to think
about them a bit and see if we could come up with something
along those lines.  While it is an advantage that Darren is not
too tied to the current infrastructure, there is the disadvantage
that he also has no sysadmin experience, and thus has no
gut-feeling for what issues come up when trying to do a ports-
collection.
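
As one possible starting point for looking at that dependency graph,
port-level dependencies can be collapsed into category-level edges.
The sketch below assumes the dependencies are already available as a
mapping of port origin to dependency origins (how to extract that
from the tree is not covered); the function names and the sample data
are made up for this example.

```python
from collections import defaultdict


def category_edges(port_deps):
    """Collapse {origin: [dependency origins]} into category edges.

    Returns {category: set of other categories it depends on}.
    Dependencies inside the same category are ignored.
    """
    edges = defaultdict(set)
    for origin, deps in port_deps.items():
        src = origin.split("/", 1)[0]
        for dep in deps:
            dst = dep.split("/", 1)[0]
            if dst != src:
                edges[src].add(dst)
    return edges


def dependents_of(edges, category):
    """Sorted list of categories that depend on the given category."""
    return sorted(src for src, dsts in edges.items() if category in dsts)


# Made-up sample data, for illustration only.
sample = {
    "www/bar": ["devel/foo", "devel/lib"],
    "devel/foo": ["lang/x"],
    "devel/lib": ["devel/foo"],
}
edges = category_edges(sample)
print(dependents_of(edges, "devel"))
```

Counting the dependents of each category this way would show how
central something like devel really is.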

-- 
Garance Alistair Drosehn            =   gad at gilead.netel.rpi.edu
Senior Systems Programmer           or  gad at freebsd.org
Rensselaer Polytechnic Institute    or  drosih at rpi.edu

