Re: Please stop writing specifications that cannot be parsed/p
Marcus Reichardt <u123724@gmail.com> writes:

> FYI item 4 in the list of goals of XML is
> "It shall be easy to write programs which process XML documents."
>
> What is meant by "easy to write programs processing XML documents"? To
> implement an XML parser from scratch?

Among other things, yes, ease of writing parsers was on the minds of those who defined XML.

> In that case, I guess it's relatively safe to say this is neither very
> easy nor relevant since, fortunately you might say, nobody is creating
> XML parsers from scratch.

I disagree on both counts. Of course, like many things, the task of writing an XML parser may prove more complicated, and involve more subtleties, than it looked at first. But in many programming languages, the hardest part of writing an XML parser is supporting Unicode properly, which is worth doing anyway.

> But when using a parser library anyway, then processing XML
> is exactly as complicated as processing SGML since the parser lib does
> the heavy lifting, and emits just the same SAX events in both cases.

I think this is empirically false.

In 1988, if I remember correctly, I heard a well-known computer scientist explain why his project used an SGML-like syntax and not SGML. If you hand Kernighan and Ritchie to a graduate student, he said, they transcribe the grammar and a couple of days or hours later they have a parser for C. When you hand the SGML specification to a graduate student, they transcribe the grammar and a couple of hours later they have 179 (or some equally crazy number) reduce/reduce conflicts. And a week later, they have wrestled it down to 38 reduce/reduce conflicts. Still no parser. It's no wonder there are so few SGML tools, he said. The spec goes out of its way to put unnecessary barriers in the way.

In 1996, those working to define XML said, informally, that a reasonably competent graduate student should be able to write a correct XML parser in about a week. During the course of our work, we heard from an undergraduate in Austria that in his case it had taken two weeks. If I remember correctly, he said it took longer than he had hoped because his implementation language had only spotty Unicode support. But possibly that's a false memory.

Since markup minimization in SGML and what ISO 8879 calls the 'ambiguity' rule are so unlike any standard concepts in off-the-shelf parsing tools, they require a good deal of special coding. I could easily be missing something, but I am unaware of anyone who has developed a conforming SGML parser using only standard off-the-shelf parser generators. (Certainly I could not do so.) And in the first ten years of SGML's being a standard, only a handful of conforming parsers had been produced. I believe it's safe to say that XML had more conforming parsers than SGML within ten weeks of being a W3C Recommendation.

> From what I gather by eg [1], "easy to implement" comes from a hope
> that there could be more than a single implementation. Fortunately,
> that has been taken care of for SGML now ;)

Yes, those of us who used SGML at that time chafed under the scarcity of SGML software, and we hoped that there would be many, many more programs for XML than there were for SGML. "More than one" implementation was not the bar we set.

Michael

-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com