database tables for VuXML
Dan Langille
dan at langille.org
Tue Aug 24 14:38:57 PDT 2004
On 22 Aug 2004 at 14:29, Jacques A. Vidrine wrote:
> [Added freebsd-vuxml@]
>
> On Sat, Aug 21, 2004 at 07:11:21PM -0400, Dan Langille wrote:
> > Hi folks,
> >
> > I'm in the early stage of incorporating VuXML information into
> > FreshPorts.
>
> Alright! That'll be cool.
>
> > I have a preliminary database table design which appears
> > below. If you think I've missed any essential information out of
> > this design, please let me know.
> >
> > First: has anyone already worked on creating database tables for the
> > VuXML data?
>
> I have an old prototype application which uses SQLite. FWIW, here is
> the schema I used then. (the formatting is a little funny because this
> was embedded in some other code)
>
> CREATE TABLE vuxml (
> vid VARCHAR PRIMARY KEY UNIQUE
> , topic VARCHAR
> , description VARCHAR
> , discovery DATE
> , entry DATE
> , modified DATE
> -- computed max(entry, modified) for convenience in
> -- other expressions
> , last DATE
> );
This is pretty much identical to what I have.
>
> CREATE TABLE refs (
> vid VARCHAR
> , type VARCHAR
> , text VARCHAR
> );
Ahh, yes, that repeating group. I'm not storing that. I should be
> CREATE TABLE affected (
> vid VARCHAR
> , type VARCHAR
> );
What is this type field? I didn't follow what the affected table
contains.
> CREATE TABLE names (
> affected INTEGER
> , name VARCHAR
> );
>From your example below, the names table would have entries for mutt
and ja-mutt. I think I'll do that as well. See the end of my
message for how I plan to relate VuXML entries to specific ports.
> CREATE TABLE ranges (
> affected INTEGER
> , lowop VARCHAR
> , low VARCHAR
> , glow0 INTEGER
> , glow1 INTEGER
> , glow2 INTEGER
> , glow3 INTEGER
> , highop VARCHAR
> , high VARCHAR
> , ghigh0 INTEGER
> , ghigh1 INTEGER
> , ghigh2 INTEGER
> , ghigh3 INTEGER
> );
>
> The `ranges' table perhaps looks erm interesting. In this case,
> I encoded each package version number into four 32-bit integers.
> This allowed one to lookup an affected package completely in SQL.
> However, the encoding I used may now be incorrect in some ways, since
> pkg_install et al were recently changed to treat version numbers
> differently.
FreshPorts stores PORTVERSION and PORTREVISION as separate text
fields. I take it that the ranges equations should compare only
PORTVERSION and ignore PORTREVISION.
> Also, SQLite could not really optimize the complex SQL
> statement that resulted. So I think it is interesting for further
> study, but I doubt I'd do it that way again unless I had a very
> particular need to use pure SQL or the database was huge.
SQLite is a great little database. I like what I've seen of it
(mostly via my work on Bacula). I'll be using PostgreSQL for this.
>
> > create table vuxml
> > (
> > id serial not null,
> > -- internal FreshPorts ID
> > vid text not null,
> > -- the vuln vid. e.g. c4b025bb-f05d-11d8-9837-000c41e2cdad
> > topic text not null,
> > description text not null,
> > date_discovery date ,
> > date_entry date ,
> > date_modified date ,
> > status date not null,
> > primary key (id)
> > );
> >
> > create table vuxml_port_xref
> > (
> > id serial not null,
> > vuxml_id integer not null,
> > port_id integer not null,
> > primary key (id)
> > );
> >
> > create table vuxml_port_xref_range
> > (
> > id serial not null,
> > vuxml_port_xref_id text ,
> > range_operator_start text ,
> > range_operator_end text ,
> > range_version_start text ,
> > range_version_end text ,
> > primary key (id)
> > );
> >
> > If you consider the output from: vxquery -t vuxml ~/VuXML/vuln.xml
> > tnftpd, the entry in vuxml_port_xref_range for tnftpd might be:
> >
> > (1, 'lt', 20040810, NULL, NULL)
>
> Seem reasonable. For what its worth, storing the range operators as
> `>', `=>', etc lets you get something human readable as the output of a
> SELECT, e.g.
>
> SELECT range_version_start, range_operator_start, port_name,
> range_operator_end, range_version_end FROM ... WHERE ...;
>
> ('20040810', '<', 'tnftpd', '', '')
>
> I found that handy for manual inspection of the database.
Whatever I store, need to be able to run queries upon it. I was
thinking of keeping the existing values as that might be easier with
perl. Not sure yetp.
> You may notice from my example that I used separate `name', `ranges',
> and `affected' tables. I found this easier when filling the database,
> since the <package>/<system> elements are kind of a cross-product
> operator. For example, when processing something like
>
> <package>
> <name>mutt</name>
> <name>ja-mutt</name>
> <range><ge>1.4</ge><lt>1.4.2</lt></range>
> </package>
This is what I do with my vuxml_port_xref table. That table relates
a vid to a port.
> I first created an `affected' entry to contain the other references.
> Then, as I hit each `name' I just associated it with the `affected'
> entry, and as I hit each `range' I did the same. (Otherwise one
> needs to construct a few lists, and then make all entries once
> the </package> end tag is seen.) I think this is also easier to
> modify if we get new children of <package>, e.g. the <category> and
> <architecture> tags that are currently in the VuXML pre-1.2 DTD.
My plan is to populate empty the vuxml_* tables each time there is a
commit to the data file. This keeps the vuxml system totally
separate from FreshPorts.
Each row under Commit History (e.g.
http://beta.freshports.org/sysutils/bacula/) relates to a row from
the commit_log_ports table. That looks something like this:
create table commit_log_ports
(
commit_log_id integer not null,
port_id integer not null,
needs_refresh smallint not null,
port_version text ,
port_revision text ,
primary key (commit_log_id, port_id)
);
I'll probably create another table commit_log_ports_vuxml:
create table commit_log_ports
(
commit_log_id integer not null,
port_id integer not null,
vuxml_id integer not null ,
primary key (commit_log_id, port_id)
);
Looking at the current data, there's about 140 affected ports, but I
haven't broken that with respect to ranges, which is what the above
will do.
I'll either do that, or add the vuxml_id column to the
commit_log_ports table, but I'd rather keep it separate. Time will
tell.
--
Dan Langille : http://www.langille.org/
More information about the freebsd-ports
mailing list