[PATCH] docproj port needs to use tidy-devel

Murray Stokely murray at stokely.org
Fri Jan 25 18:09:19 UTC 2008


On 1/25/08, Gábor Kövesdán <gabor at freebsd.org> wrote:
>
> First, sorry for the late answer. Not just the xhtml, but the html
> output of tidy is incorrect as well, it does not validate. (I think
> www/63552 is related, because without tidy, such errors don't appear.)
> But, the newer tidy versions completely mess up character sets. They
> mess the Hungarian characters set surely, but I suspect there are
> others, too. The only reason that we don't disable it in the Hungarian
> project is that builder has an ancient version, which works fine.
> Besides, different versions of tidy have different set of command line
> options, which makes our toolchain less portable.
> But anyway, why we do really need tidy? I made some tests before without
> tidy and the only thing that I had to do for generating valid pages was
> to reinplace-edit the DTD. As sgmlnorm outputs our custom DTD, the
> webpages were not valid, but after replacing them with HTML 4.1
> Transitional DTD, everything validated. I'd prefer see it go away.
> Yes, I know that one reason for tidy is the indenting and line breaking
> in HTML code, the output of sgmlnorm is not for human consumption. But
> cannot we do that in a simpler way?


xsltproc can output nice .html with line breaks and indentation.  For
example, I use this for the RSS feeds to make them nice and
human-readable without going through tidy:

<xsl:output method="xml" indent="yes"/>
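
For context, here is a minimal sketch of a complete stylesheet built
around that line.  It assumes a plain identity transform (copy the input
unchanged and let the serializer re-indent it); that is an illustration,
not the stylesheet actually used for the feeds, and the file names below
are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask the serializer to add line breaks and indentation. -->
  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

  <!-- Assumption: whitespace-only text nodes in the input are not
       significant, so drop them and let the indenter lay things out. -->
  <xsl:strip-space elements="*"/>

  <!-- Identity transform: copy every node and attribute as-is. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Saved as indent.xsl, it could be run as "xsltproc indent.xsl feed.xml >
feed-pretty.xml".  The strip-space rule is probably too aggressive for
pages with mixed inline content, where whitespace between elements can
matter.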

> One more idea, which came to my mind about this. Currently, our webpages
> are not uniform. We use HTML 4.1 for our pages generated from .sgml and
> XHTML 1.1 for .xsl output. What do you think about using XHTML 1.1
> uniformly? Obviously, sgmlnorm cannot do that, but there are advantages


Yeah, that's a low-priority, could/should-be-done sort of item.  I would
focus first on any pages that actually don't validate, or where you want
to add some XML feature that can't currently be accomplished with the
older SGML-based pages.  Updating old content or adding new content to
the Handbook would, I think, be even more useful if you have the time.

> As a result, I think it would be a good idea. Maybe it would be a good
> SoC project for me to polish the pages in this way as I'm interested, I
> want to learn more XML stuff and I want to participate in the upcoming
> SoC again. Another item would be to bring the doc repo to DocBook5 / XML.


I don't think web projects like this are the main intent of the Summer
of Code program.  We had one project in this area in 2005, and Emily did
an excellent job with it, writing a LOT of XSLT code for us and
completely redesigning the web site, but converting the remaining SGML
to XML isn't really a good fit for the program.

But by all means, please do convert any individual SGML files to XML if that
is where your interests lie.

                    - Murray


