[XML-DEV Mailing List Archive Home] [By Thread] [By Date] [Recent Entries] [Reply To This Message] Re: RE: XML parsers use what computational power?
* Liam R E Quin wrote: >On Wed, 2013-04-10 at 09:43 +0000, Costello, Roger L. wrote: >> Resolving these entities: >> >> <!ENTITY ha1 "ha"> >> <!ENTITY ha2 "&ha1;&ha1;"> >> ... >> <!ENTITY ha128 "&ha127;&ha127;"> >> >> requires an amount of memory that is exponential to the number of entities. >> >> Therefore an XML parser requires more computational power than a linear bounded automata [1]. > >That does not follow at all. The problem here is that an "XML parser" is not required to actually produce any output on its own, it is assumed that there is some higher level application it is interacting with. The problem above occurs if, for instance, the machine writes out the input document with all the entities expanded onto its "tape", in other words, if it has to "call" some function of the higher level application and pass in the value of an attribute with all entities expanded as a "type position" and then the higher level application takes over. That is a common API, and it seems that this scenario does have the problem above, the "parser" is in need of more tape than what can be described as a linear function in terms of the input length, but that's an implementation choice; it would be fine if the "parser" component reports only small bits to the higher level application. Generally, it may be best to study the "complexity" of XML in terms of what is needed to determine whether an input string is a "well-formed XML document", under the assumption that the input string is encoded using a character encoding supported by the XML processor and the encoding isn't any more complex than the processor itself. >My understanding is that XML is at worst LL(2), not LL(n), in parser >complexity. So it can be parsed by a deterministic finite state >automaton, as described e.g. in the Dragon Book, the standard text for >such things (Aho, Hopcraft & Ullman). The 2 is because you may need two >tokens of look-ahead before you can reduce, in a shift-reduce parser, >although that's a supposition on my part. The EBNF grammar in the XML 1.0 Recommendation describes a context-free language, but the set of all XML 1.0 documents is not context-free, and since LL parsers recognize only a subset of the context-free languages, they cannot decide whether any given string is an XML document. Reasons for XML being not context free include many of the "WFCs" in the speci- fication (not all of them, e.g. numeric character references have to refer to a `Char`, and that is a regular problem not encoded in the EBNF grammar for reasons of readability among others); one example is that `<x a='' a='' />` is not an XML document, and you cannot express that attribute names must be unique in a context-free grammar, so XML is not LL. A simple way to look at this is thinking about how to make a generator for random well-formed XML 1.0 documents. If XML was context-free, you could write a grammar for XML 1.0 documents of the form Nx ::= (Nx or Tx)* ... where `Nx` is any non-terminal symbol (like `document` in the EBNF) and `Tx` is essentially a Unicode character, and `*` means "zero or more". You would construct a random document from that by picking `document` as start symbol, and recursively replace the right side by a random choice among the options given (for `or` you take one or the other, for `*` you might take zero or one or more). What you have already generated never factors into that, so you will end up generating duplicate attributes. -- Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de 25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] |
PURCHASE STYLUS STUDIO ONLINE TODAY!Purchasing Stylus Studio from our online shop is Easy, Secure and Value Priced! Download The World's Best XML IDE!Accelerate XML development with our award-winning XML IDE - Download a free trial today! Subscribe in XML format
|