Re: Re WF, V, and MSXML
In message <199706082339.QAA08654@b...>, Terry Allen <tallen@s...> writes
>|
>| > But for an XML parser, the boundaries are shifted, because
>| > it has to deal with an XML document that *includes* the prologue
>| > (XMLlang production 23, where "element" corresponds to the SGML
>| > "document instance set", I think).  I don't know whether this is a good
>| > idea or not, just trying to understand it as an early adopter.

I don't see _any_ difference between SGML and XML on this front.  SGML
parsers also have to deal with the prolog: the formal syntax of an "SGML
document entity" is:

    S SGML declaration, prolog, document instance set, 'entity end' signal

(so in fact they also have to deal with the SGML declaration as well!)

The fact that the default ESIS output from the parser doesn't include any
DTD-related information shouldn't be taken to mean the parser hasn't
processed this information.

>| I am actually unclear whether a WF-only parser (e.g. Lark) has to read the
>| internal subset at all, other than skipping to the ']>' at the end.  If it
>| *does* read and parse it, what does it do with the information.  For example,
>
>The soft spot here is the first line of 2.2, where "match" is not
>defined except that later in that section it "implies" a few things,
>which are not apparently meant to be a complete set.  What the
>WF document matches is production 23, Prolog element Misc*.  As
>the processor attempting to determine WFness must look inside element to
>determine WFness, presumably the same is true of prolog.
>
> ... unless I determine WFness by *parsing* with a *real parser* which
>the processor is not meant to be ...

I would read the existing XML spec in a stricter spirit than you have
done.  To me, "match" means just that, i.e. that _if_ a WF document has
an internal or an external DTD, these should be parsed as though for a
valid XML document.  Any _syntactic_ errors in the DTD should be flagged,
even in 'WF' mode.
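As a rough illustration of the "flag syntactic DTD errors even in WF mode" reading, here is a minimal sketch using Python's bundled expat bindings, which provide a non-validating parser. The two documents and the helper name `is_well_formed` are hypothetical and are not taken from this thread. The sketch shows how one particular parser behaves; it is not a statement of what the XML draft requires.

```python
from xml.parsers import expat

# Hypothetical documents. The second has a malformed entity declaration
# (no replacement text) in its internal subset; the element content is fine.
GOOD_DOC = '<!DOCTYPE doc [<!ENTITY greeting "hello">]><doc>&greeting;</doc>'
BAD_DOC = '<!DOCTYPE doc [<!ENTITY greeting>]><doc>text</doc>'


def is_well_formed(doc):
    """Return True if expat, a non-validating parser, accepts the document."""
    parser = expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input.
        parser.Parse(doc, True)
    except expat.ExpatError:
        return False
    return True


print(is_well_formed(GOOD_DOC))
print(is_well_formed(BAD_DOC))
```

With this parser the first document should be accepted and the second rejected, even though no validation takes place: the internal subset is parsed, not skipped up to the ']>'.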
(Bear in mind that no-one is forcing WF documents to have a DTD at all,
except for entity declarations.)  If you try to adopt a 'don't care' mode
of parsing for the DTD when dealing with WF documents, you probably
create many more problems than you solve.

The only difference is the use that is made of the DTD information: in a
WF document only the entity declarations matter to the parser.

>| what is the implied structure of the document in:
>|
>| <!DOCTYPE FOO [
>| <!ATTLIST FOO XML-LINK CDATA #FIXED "SIMPLE">
>| ]>
>| <FOO HREF="bar"/>
>|
>| Can we assume that FOO (which has no Element declaration) has an ATTLIST as
>| given, and that therefore it inherits the SHOW and ACTUATE attributes?
>| IOW *must* a parser decorate all matching elements with the ATTLISTS in the
>| internal subset?
>
>No, not per XMLlang alone.  FOO's only declared attribute has as its name
>the unreserved string "XML-LINK" although it uses an undeclared attribute
>name "HREF".  So it is WF but not valid.

.. and since it is only well-formed and not valid, it cannot (in my view)
partake in any operations that require knowledge of <!ELEMENT or
<!ATTLIST declarations.  IOW, XML-LINK is not relevant to WF documents
...?

Richard Light
SGML and Museum Information Consultancy
richard@l...
3 Midfields Walk
Burgess Hill
West Sussex RH15 8JA
U.K.
tel. (44) 1444 232067

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)
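Returning to the <!ATTLIST FOO ...> example quoted above: the question of whether a parser "decorates" FOO with the declared attribute can be tried out with a non-validating parser. The sketch below assumes Python's expat bindings. The helper name `foo_attributes` is hypothetical, and the `specified_attributes` switch belongs to that library, not to the XML draft under discussion. It shows two possible behaviours side by side without saying which one the spec intends.

```python
from xml.parsers import expat

# The document from the quoted example.
DOC = '''<!DOCTYPE FOO [
<!ATTLIST FOO XML-LINK CDATA #FIXED "SIMPLE">
]>
<FOO HREF="bar"/>'''


def foo_attributes(doc, use_attlist=True):
    """Return the attributes reported for each element by a non-validating parse."""
    seen = {}

    def start(name, attrs):
        seen[name] = dict(attrs)

    parser = expat.ParserCreate()
    # 1 = report only attributes written in the instance;
    # 0 = also report values derived from ATTLIST declarations.
    parser.specified_attributes = 0 if use_attlist else 1
    parser.StartElementHandler = start
    parser.Parse(doc, True)
    return seen


print(foo_attributes(DOC, use_attlist=True))
print(foo_attributes(DOC, use_attlist=False))
```

With `use_attlist=True` the parser should report both HREF and the fixed XML-LINK value for FOO; with `use_attlist=False` only HREF. Either way no validity checking is done, so the undeclared HREF attribute and the missing element declaration go unreported.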