XML Torture Test: Parsers Fail
Without intending to do so, I have devised an XML document that exposes many problems in almost all validating XML parsers, and in non-validating parsers that resolve external entity references. You will find this torture test at

http://metalab.unc.edu/examples/players/index.xml

It has broken every parser I've thrown at it in one way or another, including the one in IE5, with the single exception of RXP. However, RXP reports some warnings that do not appear to be errors, and it missed some problems involving the lack of encoding declarations in the text declarations in an earlier version of the document, which xml4j 2.0.4 (but not 1.1.14) picked up. These have now been fixed.

As best I can tell, this document is both well-formed and valid. It's hard to say for sure when many different parsers all fail to process it, mostly by either giving up completely or generating incorrect error messages. Until I'm more confident the document is correct, I'm simply defining a broken parser as one that

1. describes a valid document as invalid (Microsoft?, xml4j?)
2. describes an invalid document as valid (RXP)
3. describes an invalid document as invalid but gives the wrong reason (Microsoft?, xml4j?)

Once I've conclusively determined whether my document is valid, I should be able to determine whether Microsoft and xml4j fit into category 1 or 3 or both.

What's torturous about this example is that it defines over 1000 separate external general entities in several dozen different DTDs. Currently only one of those entities is actually used in the main document, but I plan to expand it to use all 1000+ entities. Thus it's likely to become even more difficult to parse properly. Leaving aside the question of whether this is the proper design for this document, it's nonetheless the case that parsers should be able to handle it. Parser authors may wish to investigate further.
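
The three-part definition of a broken parser given here can be written down as a small decision function. What follows is only a sketch in Python, not anything taken from the test itself: the names `Verdict` and `classify` are hypothetical, and it assumes the document's true validity (and, for an invalid document, the true reason) is already known, which is exactly what has not yet been settled.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    """Hypothetical labels mirroring the three categories of broken parser."""
    CORRECT = "parser behaved correctly"
    REJECTS_VALID = "1. describes a valid document as invalid"
    ACCEPTS_INVALID = "2. describes an invalid document as valid"
    WRONG_REASON = "3. describes an invalid document as invalid, wrong reason"


def classify(doc_is_valid: bool,
             parser_says_valid: bool,
             reported_reason: Optional[str] = None,
             true_reason: Optional[str] = None) -> Verdict:
    """Compare a parser's verdict with the (assumed known) ground truth."""
    if doc_is_valid:
        # A valid document must be accepted; any rejection is category 1.
        return Verdict.CORRECT if parser_says_valid else Verdict.REJECTS_VALID
    if parser_says_valid:
        # An invalid document that slips through is category 2.
        return Verdict.ACCEPTS_INVALID
    if reported_reason != true_reason:
        # Rejected, but the diagnostic does not match the real problem.
        return Verdict.WRONG_REASON
    return Verdict.CORRECT
```

Under these assumptions, a parser that rejects the document lands in category 1 if the document turns out to be valid (`classify(True, False)`), and in category 3 if the document is invalid but the reported reason is not the real one. This is why the question marks in the list cannot be resolved until the document's validity is established.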
I'd appreciate the assistance of anyone who can spot by eye mistakes I made that the parsers may be reporting incorrectly.

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@m...            | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| XML: Extensible Markup Language (IDG Books 1998)                   |
| http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/     |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/   |
| Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/      |
+----------------------------------+---------------------------------+