Re: git: b7f05445c00f - main - Add WWW entries to port Makefiles

From: Stefan_Eßer <se_at_freebsd.org>
Date: Fri, 09 Sep 2022 11:14:19 UTC
On 08.09.22 at 20:45, Dmitry Marakasov wrote:
> * Stefan Eßer (se@freebsd.org) wrote:
> 
>>> Why weren't all URLs moved to WWW? For my ports, if there were multiple
>>> WWW entries, all of them are important and thus all of them should be
>>> moved to WWW=. Is there a policy which disallows multiple URLs in WWW?
>>
>> The contents of the WWW variable are made available in the INDEX
>> and in the package manifests.
>>
>> If multiple lines starting with "WWW:" were present in pkg-descr,
>> then only the URL from the first line was used for that purpose.
>>
>> But there were quite a number of ports that added a generic
>> framework URL (e.g. to rubyonrails.org), and the order of WWW:
>> lines was not always correct (e.g. with some less important URL
>> in the first line).
>>
>> All URLs have been preserved, either in the Makefile or in the
>> pkg-descr file. If the one in the Makefile is not the one you
>> want to be copied into the INDEX, then put another one in.
>>
>> No decision has been made whether more than one URL may be defined
>> in WWW, but in that case the order of entries becomes arbitrary,
>> again, and it will be impossible to identify the most relevant
>> URL from its presence in the WWW variable.
> 
> The order was and will remain defined as pkg-descr lines and WWW
> items are ordered. For the cases where only a single item is allowed,
> taking the first item was and is the obvious option.

This did not work in the past (many pkg-descr files did not have
the relevant URL as the first entry) and is unlikely to work in
the future.
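
To illustrate the "first WWW: line wins" behaviour discussed above, here
is a minimal sketch. It is not the actual ports tooling; the function
names and the sample pkg-descr text (including the example.org URL) are
assumptions made up for this illustration.

```python
def www_urls(pkg_descr_text):
    """Return the URLs from all lines starting with "WWW:", in file order."""
    urls = []
    for line in pkg_descr_text.splitlines():
        if line.startswith("WWW:"):
            url = line[len("WWW:"):].strip()
            if url:
                urls.append(url)
    return urls


def first_www(pkg_descr_text):
    """Return only the first WWW: URL, or None if there is none."""
    urls = www_urls(pkg_descr_text)
    return urls[0] if urls else None


# Hypothetical pkg-descr with the generic framework URL listed first.
sample = """A hypothetical Rails plugin.

WWW: https://rubyonrails.org/
WWW: https://example.org/project/
"""

print(first_www(sample))  # prints https://rubyonrails.org/
```

In this sample the generic framework site is picked, not the project
page, which is the ordering problem described above.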

The documented use of the WWW field in the Makefile is to provide
a link to a project website or other documentation that helps the
end user e.g. to configure and use the software.

This variable is not meant to tell the port maintainer where to
look for updates, for example. And most users will not be
interested in the GitHub repository used to fetch the sources,
but rather in a link to a wiki or other project resource that
is oriented towards the end user.

A single link with information for this specific purpose is most
useful and is easy to check and maintain. Most cases of multiple links
in pkg-descr files either pointed at a generic framework site
(e.g. rubyonrails.org) or the repository (which often does not
provide any relevant usage information, but is oriented towards
developers and porters).

>> In ports you maintain I see additional URLs only referencing the
>> repository directory where a port is maintained (e.g. on GitHub),
>> and only in a very small fraction of your ports.
>>
>> There generally is one official project website and other relevant
>> information is linked to that starting point.
> 
> Sometimes that's the case, sometimes not. For instance, most Python
> modules have both a PyPI URL and a git repository; neither of these is
> in fact an "official homepage" and both are equally important.

The end user is interested in the PyPI URL. The git repository is
of interest to developers, and they will find the repository without
needing to refer to the WWW field. It is possible to document such
URLs in various places, including the Makefile and the pkg-descr
file, but the WWW variable is not meant to be used for this purpose.

>> The only exception appears to be https://mg.pov.lt/objgraph/ and
>> that URL is easily found on the website in the WWW field of that
>> port's Makefile.
>>
>> I really do not see your point. It is hard enough to have a single
>> valid URL in the WWW field of each port, and I plan to add a tool
>> that tests for stale URLs.
> 
> There already is, it's repology.org

Please show me how that solves the issue I'm talking about.

There are hundreds of URLs that point at server hosts that do not
exist in the DNS. And many more URLs that are otherwise stale, but
which need more effort to be verified.
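
As an illustration of the simplest of these checks (host names that do
not resolve), here is a minimal sketch. It is not the tool mentioned
above; the function names and the injectable "resolver" parameter are
assumptions for this example. It only covers the DNS case, not URLs
that are stale in other ways.

```python
import socket
from urllib.parse import urlsplit


def host_resolves(host, resolver=socket.getaddrinfo):
    """Return True if the host name can be resolved."""
    try:
        resolver(host, None)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror: lookup failed; UnicodeError: invalid host name encoding
        return False


def unresolvable_urls(urls, resolver=socket.getaddrinfo):
    """Return the URLs that have no host part or whose host does not resolve."""
    stale = []
    for url in urls:
        host = urlsplit(url).hostname
        if host is None or not host_resolves(host, resolver):
            stale.append(url)
    return stale
```

Calling unresolvable_urls() with a list of WWW values performs real DNS
lookups by default; passing a different resolver callable allows the
logic to be exercised without network access.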

>> Having multiple URLs in WWW instead of the one that is most
>> relevant to a prospective user of the package will lessen the
>> value of this information, IMHO.
> 
> That sounds like nonsense to me. Instead, leaving only one URL
> where there can be multiple URLs is losing important information,
> and having URLs in different places is a pessimization.

There is quality and quantity of information - and you want quantity
while I aim for quality.

> Summarizing, I assume it's allowed to have multiple entries in WWW
> and I plan to move all remaining URLs there. It's also great news
> that these WWWs will make their way into INDEX, as the aforementioned
> repology uses these URLs to match projects, and having multiple URLs
> will increase connectivity of the graph.

I'm not sure that your assumption is valid and you are going to break
downstream scripts (including some in our ports system).

The WWW variable has been documented and the change to a single value
has been accepted by the portmgr and docs team.

The WWW variable has not been introduced to simplify or improve the
information easily available for repology. If you want to change the
definition of WWW then create a review and get it accepted.

If you put multiple URLs into the WWW field you'll be responsible for
the fall-out.

Regards, Stefan