[CALL FOR REVIEW] doc and www converted to XML

Simon L. B. Nielsen simon at FreeBSD.org
Tue Aug 21 19:54:46 UTC 2012


On 20 Aug 2012, at 17:48, Gabor Kovesdan <gabor at FreeBSD.org> wrote:

> Dear Folks,
> 
> I'm glad to announce that the first milestone of the XML migration is
> available for review in the projects/sgml2xml branch. To check it out,
> run the following:
> 
> svn co http://svn.freebsd.org/doc/projects/sgml2xml sgml2xml
> 
> The build process - from the end user perspective - works in the same
> way. In short, use make all at the proper place, to build only web, run
> make all WEB_ONLY=yes in the htdocs dir, etc. Then use make install with
> DESTDIR defined to install files to the proper place.

We really should fix this (the magic of en/htdocs also building other languages, DESTDIR not meaning what it does in base, etc.), but that can be done later.
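
For anyone who wants to try the branch, the quoted steps could be strung together roughly as follows. This is only a sketch: the htdocs path is taken from the svnweb URL further down in this mail, while the DESTDIR value and the DRY_RUN switch are assumptions made up for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the checkout/build/install steps quoted above.
# Assumptions (not from the original mail): the DESTDIR value and
# the DRY_RUN switch. DRY_RUN=1 only prints the commands.
DRY_RUN=${DRY_RUN:-1}
DESTDIR=${DESTDIR:-/tmp/sgml2xml-www}   # hypothetical install target

# Print a command, and execute it unless we are in dry-run mode.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

review_build() {
    run svn co http://svn.freebsd.org/doc/projects/sgml2xml sgml2xml &&
    # Build only the web pages from the English htdocs directory.
    run cd sgml2xml/en_US.ISO8859-1/htdocs &&
    run make all WEB_ONLY=yes &&
    # DESTDIR here is the doc tree's notion of it, not the one from base.
    run make install DESTDIR="$DESTDIR"
}

review_build
```

Setting DRY_RUN=0 in the environment runs the commands for real; that needs svn and the doc build tools installed.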

> A rendered version of the website is available here:
> http://people.freebsd.org/~gabor/xmlweb/data/
> For the documentation, you can directly go to:
> http://people.freebsd.org/~gabor/xmlweb/data/doc/
> 
> This branch includes the following changes:
> - Documentation is updated from DocBook 4.1/SGML to DocBook 4.2/XML
> - Webpages are updated from HTML 4.01 Transitional to XHTML 1.0
> Transitional

I looked at a random page, and the indentation of the header is a bit funny. Is that just an artifact of the automatic conversion? Example: http://svnweb.freebsd.org/doc/projects/sgml2xml/en_US.ISO8859-1/htdocs/logo.sgml?revision=39396&view=markup

> - Static webpages are now processed by XSLT behind the scenes
> - Webpages are now built with fewer cycles; tidy has been removed and the
> date processing is now done by XSLT

Yay. tidy die die die :-).

> - Generated webpages are now actually valid (they did not use to be)
> - All XSLT stylesheets now pull in a main XSLT, which reduces duplicated
> markup
> - Site map and index are converted to an XML format with an XSLT
> transformation that generates the output
> - For docs, there is now only one entity set for both articles and books
> - Some trademark/legalnotice entities have been merged to a cohesive
> single entity file
> - Untranslated entity sets are now always pulled in from the English
> tree instead of redundant copies
> - The base and enbase entities are already automatically generated so
> remove inline definitions from individual files
> - Fetch the LEGAL file via http instead of depending on CVS

If you are going to change it, could you please change it to use svn, with a REPO path we can set from the web build wrapper? I would REALLY like it if we could get the web build fixed so that it never tries to fetch data from the internet.
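
Something along these lines is what I have in mind, as a rough sketch only. REPO, LEGAL_PATH and the mirror location are hypothetical names and values; the real location of LEGAL in the repository is not given here, so the path is a placeholder. It prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: fetch LEGAL with svn from a repository the build wrapper chooses.
# REPO, LEGAL_PATH and the default mirror path are hypothetical; they are
# not existing doc-tree variables.
REPO=${REPO:-file:///usr/local/svnmirror/base}   # assumed local mirror
LEGAL_PATH=${LEGAL_PATH:-path/to/LEGAL}          # placeholder path
DRY_RUN=${DRY_RUN:-1}                            # 1 = print only

fetch_legal() {
    # svn export writes an unversioned copy; --force overwrites an old one.
    cmd="svn export --force $REPO/$LEGAL_PATH LEGAL"
    echo "+ $cmd"
    [ "$DRY_RUN" = 1 ] || $cmd
}

fetch_legal
```

With REPO pointing at a file:// URL the build never has to touch the network.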

> - Convert id names to lowercase to avoid mixing different styles and for
> better readability
> - All PSGML comments are removed since they are mostly useless
> 
> As it has been discussed, the character entities will be dropped. This
> is still in progress but it is already a good moment for the rest to be
> reviewed since it is a big change that needs proper review and testing.
> At the same time, this also means that it is not easy to maintain such a
> big changeset in a branch since merging so many files is really
> time-consuming so it would be beneficial not to spend more time with
> merging this back than necessary. I would like to ask you to review this
> changeset and let me know any type of problems you encounter or any type
> of doubts you have. It would be nice if all translator projects could
> check their translations to see if there is any locale-specific problem.
> 
> Despite the big quantity of the changes, the modernization process of
> the doc tree is not complete with this change. First, we still use Jade
> and DSSSL to generate output, which is an SGML tool and works because of
> the fact that XML is a subset of SGML. But it does not really benefit from

Does that mean that the current build dependencies are unchanged?

> XML technologies and the DocBook DSSSL stylesheets are quite obsolete.
> In a second step, we should migrate to an XSL(T)-based toolset.
> Secondly, the DocBook 4.2 schema is quite old, the current DocBook
> version is 5.0. But 4.2 is the first XML version and it still works well
> with the old DSSSL stylesheets so this was a safe migration path that
> gives us more time for the migration and for QA. Once this branch is
> merged back, the migration of the toolset will be started in another
> branch.

Sure, I think it makes a lot of sense to do that separately later. Smaller steps make it much simpler to test, verify, etc.

> Thanks in advance for your review.

Thanks for working on this! We were talking about doing this when I was a new doc committer :-).

PS. Sorry for not following up on the previous mails wrt. XHTML etc.; limited time, unfortunately.

-- 
Simon L. B. Nielsen



