Subject: Re: XML Mapping to SGML DTD Author: (Deleted User) Date: 14 Sep 2002 12:04 AM
|
Hi Jeff,
the file 87269.dtd is not a valid DTD according to the XML spec, and
Stylus Studio only supports DTDs as defined in
http://www.w3.org/TR/REC-xml. The file may well be a valid DTD according to
the SGML spec, but XML was defined as a subset of SGML, dropping the
features that made SGML impractical to parse and validate.
Looking at this file in particular, SGML allows an external parameter
entity to be declared with only a public identifier, as in
<!ENTITY % dietmdb-a PUBLIC "-//USA-DOD//DTD Content Data Model Generic
Layer//EN" >
while the XML spec always requires a system identifier as well, as in
<!ENTITY % dietmdb-a PUBLIC "-//USA-DOD//DTD Content Data Model Generic
Layer//EN" "dietmdb-a.ent">
****
the relevant BNF from the XML spec is
[70] EntityDecl ::= GEDecl | PEDecl
[72] PEDecl ::= '<!ENTITY' S '%' S Name S PEDef S? '>'
[74] PEDef ::= EntityValue | ExternalID
[75] ExternalID ::= 'SYSTEM' S SystemLiteral | 'PUBLIC' S PubidLiteral S
SystemLiteral
****
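To find every declaration in a DTD that is affected by this rule, a small script can do the scan. This is only a minimal Python sketch, not something Stylus Studio provides: the function name missing_system_ids is made up for illustration, and it assumes uppercase keywords and no SGML "--" comments inside the declarations.

```python
import re

# <!ENTITY % name PUBLIC "public id" ["system id"] >
# The public identifier may be wrapped over several lines.
PE_PUBLIC = re.compile(
    r'<!ENTITY\s+%\s+(?P<name>[^\s>]+)\s+PUBLIC\s+'
    r'(?P<pubid>"[^"]*"|\'[^\']*\')'
    r'(?:\s+(?P<sysid>"[^"]*"|\'[^\']*\'))?\s*>'
)

def missing_system_ids(dtd_text):
    """Return the names of PUBLIC parameter entities that have no system literal."""
    return [m.group("name")
            for m in PE_PUBLIC.finditer(dtd_text)
            if m.group("sysid") is None]

sgml_style = ('<!ENTITY % dietmdb-a PUBLIC "-//USA-DOD//DTD Content Data Model Generic\n'
              'Layer//EN" >')
print(missing_system_ids(sgml_style))
```

Each name it reports needs a system identifier added by hand, since only you know where the referenced file actually lives.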
Also, you use tag-minimization markers such as "- O" in the element
declarations, but these are not allowed in an XML DTD.
****
[45] elementdecl ::= '<!ELEMENT' S Name S contentspec S? '>' [VC:
Unique Element Type Declaration]
[46] contentspec ::= 'EMPTY' | 'ANY' | Mixed | children
[47] children ::= (choice | seq) ('?' | '*' | '+')?
[48] cp ::= (Name | choice | seq) ('?' | '*' | '+')?
[49] choice ::= '(' S? cp ( S? '|' S? cp )+ S? ')'
[50] seq ::= '(' S? cp ( S? ',' S? cp )* S? ')'
[51] Mixed ::= '(' S? '#PCDATA' (S? '|' S? Name)* S? ')*' | '(' S?
'#PCDATA' S? ')'
****
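The same kind of scan works for the minimization markers: in the grammar above nothing may sit between the Name and the contentspec. Again a rough Python sketch under my own assumptions (minimized_elements is a hypothetical name, and the DTD is assumed to have no "--" comments between the element name and the markers):

```python
import re

# <!ELEMENT name-or-name-group  start-marker end-marker  content-model>
# Each marker is "-" (tag required) or "O"/"o" (tag may be omitted).
MINIMIZED = re.compile(
    r'<!ELEMENT\s+(\([^)]*\)|[^\s()>]+)\s+([-Oo])\s+([-Oo])\s'
)

def minimized_elements(dtd_text):
    """Return (name, start_marker, end_marker) for every SGML-style element declaration."""
    return [(m.group(1), m.group(2), m.group(3))
            for m in MINIMIZED.finditer(dtd_text)]

print(minimized_elements('<!ELEMENT techinfo - - ( version+, (%system;)+ )>'))
```

An empty list means no element declaration in the text carries the two markers.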
Hope this helps,
Alberto
|
Subject: Re: XML Mapping to SGML DTD Author: (Deleted User) Date: 24 Sep 2002 04:16 AM
|
Hi Jeff,
here is a list of the conversions I did. Please note that the DTD obtained
by running these steps is not strictly equivalent to the source SGML DTD:
the XML spec greatly reduced the expressiveness of DTDs in order to make
them simpler, so the resulting DTD will accept some data files that the
original DTD would reject as invalid.
1) External parameter entity declarations in SGML may omit the system
identifier, so you need to add the location of the file to be included.
<!ENTITY % dietmdb-a PUBLIC "-//USA-DOD//DTD Content Data Model Generic
Layer//EN" >
must become
<!ENTITY % dietmdb-a PUBLIC "-//USA-DOD//DTD Content Data Model Generic
Layer//EN" "ietmdba">
2) Element declarations carry a pair of tag-minimization characters, each
one of "-", "O" or "o", between the element name and the content model;
you need to remove them.
For example,
<!ELEMENT techinfo - - ( version+, (%system;)+ )>
must become
<!ELEMENT techinfo ( version+, (%system;)+ )>
3) SGML allows comments inside entity and element declarations, between
"--" markers; you need to remove them.
For example,
<!ENTITY alpha "[alpha ]" --Greek letter lowercase alpha -- >
must become
<!ENTITY alpha "[alpha ]">
4) In SGML you can declare multiple elements in the same declaration, as in
<!ELEMENT (sup | sub) (%f.text; | %f.oper;)+>
You need to create one ELEMENT declaration for each element:
<!ELEMENT sup (%f.text; | %f.oper;)+>
<!ELEMENT sub (%f.text; | %f.oper;)+>
5) XML introduced the restriction that, when an element can have #PCDATA
as a child, its content model must have the form (#PCDATA | child1 |
child2 ....)*
So, every time the SGML content model involves #PCDATA, you need to
rearrange it into this form. BTW, the new form is less restrictive, as in
this example
<!ENTITY % f.text "#PCDATA | roman | italic | ov">
<!ENTITY % f.oper "mark | markref | break | sup | sub | sum | integral |
product | plex | frac | diff | sqrt | root | square | power | pile |
matrix | fence | middle | tensor | mfn | box | vec">
<!ELEMENT mfn ((fname , of) | (%f.text; | %f.oper;)+)>
that must be changed into
<!ELEMENT mfn (%f.text; | fname | of | %f.oper;)*>
which no longer requires fname to be followed by of.
6) SGML has more attribute types, e.g. NUTOKEN, NUTOKENS, NAME, NAMES; the
first two should be mapped to NMTOKEN and NMTOKENS, the last two should be
changed into plain CDATA. It also has a default-value keyword, #CONREF,
that should be changed into #IMPLIED. Also, the default value of an
enumerated attribute should be placed between quotes.
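The purely mechanical steps (2, 3, 4 and the keyword part of 6) can be scripted. What follows is a rough Python sketch of how I would approach it, not a finished converter: all function names are made up, the attribute names in the usage line (step, id, count) are invented, keywords are assumed to be uppercase, name groups are assumed to use "|" and no entity references, and no attribute is assumed to be literally named NAME or NAMES. Steps 1 and 5, and the quoting of enumerated defaults, still need a human.

```python
import re

def strip_inline_comments(dtd_text):
    """Step 3: drop "-- ... --" comments inside declarations.
    Quoted literals and whole <!-- ... --> comments are left untouched."""
    token = re.compile(
        r'<!--.*?-->|"[^"]*"|\'[^\']*\'|(?P<inline>\s*--.*?--)', re.DOTALL)
    return token.sub(lambda m: "" if m.group("inline") else m.group(0), dtd_text)

def strip_minimization(dtd_text):
    """Step 2: remove the two minimization markers after the element name."""
    return re.sub(
        r'(<!ELEMENT\s+(?:\([^)]*\)|[^\s()>]+))\s+[-Oo]\s+[-Oo](?=\s)',
        r'\1', dtd_text)

def split_name_groups(dtd_text):
    """Step 4: turn <!ELEMENT (a | b) model> into one declaration per name."""
    def expand(m):
        names = [n.strip() for n in m.group(1).split("|")]
        model = m.group(2).strip()
        return "\n".join(f"<!ELEMENT {n} {model}>" for n in names)
    return re.sub(r'<!ELEMENT\s+\(([^)]*)\)\s+(.*?)>', expand, dtd_text,
                  flags=re.DOTALL)

# Step 6: SGML keyword -> XML replacement, applied only inside <!ATTLIST ...>.
ATTR_KEYWORDS = {"NUTOKENS": "NMTOKENS", "NUTOKEN": "NMTOKEN",
                 "NAMES": "CDATA", "NAME": "CDATA", "#CONREF": "#IMPLIED"}
KEYWORD = re.compile(r'(?<![\w#.-])(#CONREF|NUTOKENS?|NAMES?)(?![\w.-])')

def map_attribute_keywords(dtd_text):
    def fix(m):
        return KEYWORD.sub(lambda k: ATTR_KEYWORDS[k.group(1)], m.group(0))
    return re.sub(r'<!ATTLIST\b.*?>', fix, dtd_text, flags=re.DOTALL)

def convert(dtd_text):
    """Apply the mechanical steps in a safe order (comments first)."""
    text = strip_inline_comments(dtd_text)
    text = strip_minimization(text)
    text = split_name_groups(text)
    return map_attribute_keywords(text)

print(convert('<!ELEMENT (sup | sub) - - (%f.text; | %f.oper;)+ --scripts-- >'))
print(convert('<!ATTLIST step id NAME #IMPLIED count NUTOKEN #CONREF>'))
```

Comments are stripped first so that a "--" comment sitting between the element name and the content model cannot confuse the later steps. Please check the output with a validating parser afterwards, since regular expressions will miss unusual declarations.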
Hope this helps,
Alberto