getting PKGNAME from CONFLICTS

Dan Langille dan at langille.org
Wed Aug 15 16:35:11 UTC 2018


> On Aug 15, 2018, at 8:46 AM, Matthew Seaman <matthew at FreeBSD.org> wrote:
> 
> On 15/08/2018 00:35, Dan Langille wrote:
>>> On Aug 14, 2018, at 2:55 PM, Mark Millard via freebsd-ports <freebsd-ports at freebsd.org> wrote:
>>> 
>>> 
>>> Dan Langille dan at langille.org wrote on
>>> Tue Aug 14 17:54:01 UTC 2018 :
>>> 
>>>> . . .
>>>> At https://dev.freshports.org/www/p5-CGI/ you can see:
>>>> 
>>>>  CONFLICTS: p5-CGI.pm-[1-3]*
>>>> . . .
>>>> To extract the PKGNAME values from the CONFLICTS I will need to remove everything after the trailing dash.
>>>> . . .
>>> 
>>> p5-
>>> vs.
>>> p5-CGI.pm-
>>> vs.
>>> p5-CGI.pm-[1-
>>> 
>>> It looks to me like "trailing dash" probably has a
>>> complicated definition where some "-"(s) may exist
>>> that are to be ignored after the one of interest.
>>> In the example I'm guessing that the middle
>>> "-" is intended (so "p5-CGI.pm-").
>> 
>> Agreed.  The hard part is identifying the regex and deleting it from consideration.
>> 
> 
> If you don't mind spawning a new process, you can just do:
> 
> % pkg search -qg 'p5-CGI.pm-[1-3]*'
> p5-CGI.pm-3.63_1,1
> 
> This does assume your pkg(8) is configured to use a repository with all
> possible packages available.  The default FreeBSD repositories are a
> good choice in that regard.

They are good, but the FreshPorts database has a collection of PKGNAME
values for every port.  I want to query those values.

> Or if you already have a database table with all of the package names
> and versions, then you'll presumably want to change the glob expression
> into a regex match (in this case something like '^p5-CGI\.pm-[1-3].*')

freshports.dev=# SELECT categoryport(id) FROM (
  SELECT distinct P.id
    FROM ports P JOIN commit_log_ports CLP on P.id = CLP.port_id
   WHERE P.package_name || '-' || CLP.port_version ~ 'p5-CGI.pm-[1-3].*') TMP;
 categoryport
---------------
 www/p5-CGI.pm

This queries all ports, all commits on those ports, and the revisions for each port from that commit.

It takes about 2 seconds.

This query takes about 1 second:

freshports.dev=# SELECT distinct categoryport(id) FROM
freshports.dev-#  (SELECT P.id, P.package_name || '-' || CLP.port_version AS release
freshports.dev(#     FROM ports P JOIN commit_log_ports CLP on P.id = CLP.port_id) TMP
freshports.dev-# WHERE TMP.release like 'p5-CGI.pm%';
 categoryport
---------------
 www/p5-CGI.pm
(1 row)


I think this might be the best solution for searching for a distinct match.

Everyone feel free to correct my thinking here:

###
Ideally, the result for this query would be static, never changing. But it can change as ports are added/removed.

Assuming FreshPorts runs the above query while processing a commit to www/p5-CGI, then the data
collected *could* be invalidated during the next commit.

But this only affects PKGNAME values. Thus, we only have to requery after a new port is added or
when a change occurs to the PKGNAME for a port.
###

We can requery in the background as required.

> Unless there's a PG extension that allows using glob(3) to match
> strings?  I can't see one after a pretty cursory search.  (sqlite has
> glob(3) support, which is what the pkg(8) command above is using under
> the hood.)

I found this: https://www.postgresql.org/docs/10/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP <https://www.postgresql.org/docs/10/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP>

It seems to be pretty close.

Thank you.

--
Dan Langille - BSDCan / PGCon
dan at langille.org


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 658 bytes
Desc: Message signed with OpenPGP
URL: <http://lists.freebsd.org/pipermail/freebsd-ports/attachments/20180815/c1ab7f18/attachment.sig>


More information about the freebsd-ports mailing list