Validating docbook articles...

Dag-Erling Smørgrav des at des.no
Tue Feb 24 07:07:39 UTC 2004


Chuck Swiger <cswiger at mac.com> writes:
> ...whereas not using a SystemLiteral with the DOCTYPE declaration
> works fine with nsgmls but xmllint refuses to parse the document.  Am
> I wrong in concluding that by requiring a SystemLiteral for a document
> that is valid SGML, XML fails design goal #3, aka "XML shall be
> compatible with SGML"...?

A well-formed XML document is also a well-formed SGML document, but
the reverse need not be true; and the DTD syntax is different.
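
For illustration, this is roughly what the difference looks like in a
DOCTYPE declaration.  The DocBook public identifiers and the 4.1 / 4.2
versions are assumptions chosen for the example, not taken from the
article in question:

```xml
<!-- SGML form: a public identifier alone is enough, the catalog
     resolves it.
     <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.1//EN">
-->

<!-- XML form: the public identifier must be followed by a system
     literal. -->
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
```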

> Entity: line 5: parser error : Entity 'trade' not defined
>    designations have been followed by the <quote>&trade;</quote> or the
>
> Entity: line 6: parser error : Entity 'reg' not defined
>    <quote>&reg;</quote> symbol.</para>
>                ^

These are HTML entities.  You need to include their definitions in the
DOCTYPE block:

  <!ENTITY % HTMLlat1
    PUBLIC "-//W3C//ENTITIES Latin 1 for XHTML//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
  %HTMLlat1;
  <!ENTITY % HTMLspecial
    PUBLIC "-//W3C//ENTITIES Special for XHTML//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent">
  %HTMLspecial;
  <!ENTITY % HTMLsymbol
    PUBLIC "-//W3C//ENTITIES Symbols for XHTML//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent">
  %HTMLsymbol;
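
As a sketch of where these go: the declarations sit in the internal
subset, between the square brackets of the DOCTYPE declaration.  The
DocBook XML 4.2 identifiers and the sample content are assumptions for
the example.  Only the two sets that define reg (Latin 1) and trade
(Symbols) are shown:

```xml
<?xml version="1.0"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
  <!-- Declare the external parameter entity, then reference it so its
       declarations are pulled in. -->
  <!ENTITY % HTMLlat1
    PUBLIC "-//W3C//ENTITIES Latin 1 for XHTML//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
  %HTMLlat1;
  <!ENTITY % HTMLsymbol
    PUBLIC "-//W3C//ENTITIES Symbols for XHTML//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent">
  %HTMLsymbol;
]>
<article>
  <title>Hypothetical example</title>
  <para>Foo&trade; and Bar&reg; now resolve.</para>
</article>
```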


> Entity: line 6: parser error : chunk is not well balanced
>    <quote>&reg;</quote> symbol.</para>
>                                       ^
> article.sgml:33: parser error : chunk is not well balanced
>        &tm-attrib.general;
>                           ^

Probably unclosed tags.  SGML allows them (depending on the DTD); XML
does not.
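
A small illustration, assuming an SGML DTD that marks the para end tag
as omissible and declares xref as EMPTY (the linkend value is made up):

```xml
<!-- SGML form, acceptable only if the DTD permits the omissions:
     <para>First paragraph, see <xref linkend="example-id">.
     <para>Second paragraph.
-->

<!-- XML form: every element is closed, and empty elements use the
     self-closing syntax. -->
<para>First paragraph, see <xref linkend="example-id"/>.</para>
<para>Second paragraph.</para>
```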

> article.sgml:210: parser error : Entity 'prompt.root' not defined
>      <screen>&prompt.root; <userinput>sysctl net.link.ether.bridge.config=fxp0:0,
>                           ^
> article.sgml:211: parser error : Entity 'prompt.root' not defined
> &prompt.root; <userinput>sysctl net.link.ether.bridge.ipfw=1</userinput>
>               ^
> article.sgml:212: parser error : Entity 'prompt.root' not defined
> &prompt.root; <userinput>sysctl net.link.ether.bridge.enable=1</userinput></scre
>               ^
> article.sgml:219: parser error : Entity 'nbsp' not defined
>        <para>If you have &os; 5.1-RELEASE or previous the sysctl variables
>                                   ^

These come from failing to declare the entities (prompt.root and nbsp)
before they are used.
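
A minimal sketch of such declarations, to be placed in the internal
subset or in an .ent file that the article includes.  The replacement
text for prompt.root is a guess at what the shared .ent file defines,
so treat it as hypothetical; nbsp is the no-break space, U+00A0:

```xml
<!-- Hypothetical replacement text; check the SGML .ent file for the
     real definition. -->
<!ENTITY prompt.root "<prompt>#</prompt>">
<!-- No-break space as a numeric character reference. -->
<!ENTITY nbsp "&#160;">
```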

> This has been interesting, but it's demonstrably non-trivial to
> convert SGML docbook articles into XML.  More specifically, I don't see
> how to do so for a particular article without making non-local changes
> to .ent files being referenced by the article in order to make the XML
> version work at all, and I don't see how to make both nsgmls and
> xmllint happy at the same time.

The DTD syntax is slightly different, but it should be easy to convert
entity declarations mechanically.
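
Two typical rewrites, as a sketch.  The example.ents name, its public
identifier and the example.ent file name are hypothetical; the copy
entity follows the way the ISO entity sets declare it in SGML:

```xml
<!-- SGML form (not accepted by an XML parser):
     <!ENTITY % example.ents PUBLIC "-//Example//ENTITIES Hypothetical Set//EN">
     <!ENTITY copy SDATA "[copy  ]">
-->

<!-- XML form: a system literal follows the public identifier, and the
     SDATA entity becomes a numeric character reference. -->
<!ENTITY % example.ents PUBLIC "-//Example//ENTITIES Hypothetical Set//EN"
  "example.ent">
<!ENTITY copy "&#169;">
```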

Backward compatibility with the SGML toolchain is not, IMHO, required
or desirable.

DES
-- 
Dag-Erling Smørgrav - des at des.no


