Re: Please stop writing specifications that cannot be parsed/pr
Marcus Reichardt <u123724@gmail.com> writes:

> ...
>> The relatively deep intertwining of validation with everything else
>> in ISO 8879 makes it hard to write even simple tools.

> What would those tools be?

Well, thinking back on the kinds of programs I wrote to process SGML
data, or that I know were written by others, I'm thinking of things
like:

  - a macro in emacs or Xedit or Kedit to close the current element.

  - a program to scan a document and report, for each element, its
    fully qualified generic identifier. (That is, a string like
    "/html/body/div/div/h2/b", listing the element and all of its
    ancestors, in document order, analogous to an absolute path in a
    file system.)

  - a program to search for a particular word or character sequence
    (assumed to be uninterrupted by markup) and report the fully
    qualified generic identifier of its parent.

  - a program to read an SGML document and emit a Waterloo GML
    document suitable for formatting and printing.

  - a program to read an SGML document and emit a TeX document
    suitable for formatting and printing.

  - a program to read an SGML document written using a
    literate-programming vocabulary and tangle the source code.

  - a program to read an SGML DTD and make a list of element types
    referred to but not declared.

  - a program to read an SGML DTD and delete references to specified
    element types.

For most of these, I had no SGML parsing library to call, because for
what felt to me like a very long time there were no SGML parsers
available on the mainframe I was working on. (It's possible that IBM
had a product that did SGML parsing, but from the descriptions I could
find, I could not understand its functionality well enough to know
whether it was worthwhile trying to persuade my management to acquire
and install it.)

Eventually, I was able to port James Clark's sgmls parser to VM/CMS
and was able to use it, with CMS Pipelines, to simplify the creation
of programs like those described above.
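For illustration, the second item in the list above (reporting each
element's fully qualified generic identifier) might be sketched today
as follows. This is a minimal sketch, not a reconstruction of any of
the original programs: it assumes well-formed XML input rather than
SGML, it uses the ElementTree parser from the Python standard library,
and the function name qualified_identifiers is hypothetical.

```python
import xml.etree.ElementTree as ET


def qualified_identifiers(xml_text):
    """Return one path string per element, in document order.

    Assumption: xml_text is a well-formed XML document. Namespaced
    elements would be reported in ElementTree's {uri}local form.
    """
    root = ET.fromstring(xml_text)
    paths = []

    def walk(element, ancestors):
        # The path of an element is its parent's path plus its own name.
        path = ancestors + "/" + element.tag
        paths.append(path)
        for child in element:
            walk(child, path)

    walk(root, "")
    return paths


if __name__ == "__main__":
    sample = "<html><body><div><div><h2><b>Hi</b></h2></div></div></body></html>"
    for line in qualified_identifiers(sample):
        print(line)
```

With a conforming parser available, the program is little more than a
tree walk. Without one, the same task requires recognizing tags by
hand.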
Note that for any of the items above which I actually implemented, or
tried to implement, I was interested in a program I could use; I was
not attempting to make a tool others could use (although I would have
been flattered at the idea that others might be interested).

Note also that not all of the items in the list really qualify as
'simple' tools. Nor are the two DTD processors necessarily easier
today than in the late 1980s and early 1990s, or easier for XML DTDs
than for SGML DTDs.

> Aren't XML people usually the first to criticize ad-hoc kindof-XML
> parsers,

Perhaps some XML people -- not anyone I know well.

If you are processing data you know well, and it's more convenient to
process it with an ad-hoc Perl script, I think it's perfectly
legitimate to cut corners. If the material you are working with never
uses notations, you can save time. If the input you are trying to
process uses no entity references at all, you don't need to parse
them.

The scenario of someone responsible for a body of material writing
ad-hoc programs to solve problems before a deadline of some kind was
frequently referred to during the development of XML; the figure at
the center was called "the desperate Perl hacker", sometimes
abbreviated DPH.

Of course, tools written by a DPH to solve particular problems,
exploiting knowledge of a particular body of material, are not to be
confused with general-purpose tools. And there are probably people
who believe that conforming XML parsers are easy enough to write or
acquire that there seems to be no very good reason to use a
non-conforming parser in a tool intended for general use. (There are
certainly such people; I am one. The DPH is processing the DPH's own
data, not writing tools for others.)

It may be noted that the DPH scenario resembles the situation members
of the SGML community had often found themselves in, in the years 1986
to 1996, more than it resembles the situation most XML users find
themselves in nowadays.
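To make the corner-cutting concrete, here is a hedged sketch of the
kind of ad-hoc program a DPH might write for one of the tasks listed
earlier: search for a word and report the fully qualified generic
identifier of its parent. It is written in Python rather than Perl,
the name parents_of is hypothetical, and it is emphatically not a
conforming parser. It relies on assumed knowledge of the data: no
comments, CDATA sections, or processing instructions, no ">" inside
attribute values, no entity references that matter, and every element
explicitly closed.

```python
import re

# Matches start-tags, end-tags, and empty-element tags only. Comments,
# CDATA sections, and processing instructions are deliberately not
# handled; the data is assumed not to contain them.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^>]*?(/?)>")


def parents_of(word, text):
    """Return the path of the enclosing element for each occurrence of word."""
    stack, found, pos = [], [], 0
    for match in TAG.finditer(text):
        # Character data between the previous tag and this one belongs
        # to the element currently on top of the stack.
        chunk = text[pos:match.start()]
        found.extend("/" + "/".join(stack) for _ in range(chunk.count(word)))
        closing, name, empty = match.groups()
        if closing:
            stack.pop()  # assumes well-nested input with explicit end-tags
        elif not empty:
            stack.append(name)
        pos = match.end()
    return found


if __name__ == "__main__":
    doc = "<doc><p>some markup <b>more markup</b></p><note>markup</note></doc>"
    for path in parents_of("markup", doc):
        print(path)
```

Each of those assumptions is safe only for someone who knows the
material, which is why such a script is a private tool and not a
general-purpose one.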
Many programming languages have conforming XML parsers (although some
only have parsers without any good claim to conformance), and it has
been a long time since I wrote programs for XML or SGML input that
work the way my Spitbol and Rexx programs -- or even my CMS
Pipelines -- worked. When I face the kinds of problems we imagined a
desperate Perl hacker having to solve, or the tasks listed above, I am
more inclined to write an ad hoc XSLT stylesheet or put together a
quick and dirty XQuery module to solve the problem. I have the
impression I am not alone.

Since I program most frequently in XSLT and XQuery, I don't in fact
have to write ad-hoc parsers for XML. And a lot of people write XSLT
transforms who do not self-identify as programmers and would probably
not ever have become Perl hackers of any kind, desperate or otherwise.

But I think that aiming at the DPH had the beneficial side effect of
helping the designers of XML keep the syntax simple, which does make
it easier to write conforming XML software than it is to write
conforming SGML software for less restrictive profiles of SGML, let
alone unrestricted SGML. That did help make the XML ecosystem more
populous than the SGML ecosystem we were then living in.

> and why would you side-step a parser lib you just put a lot
> of effort into creating, to use a task-specific ad-hoc kindof-XML
> parser instead?

Not everyone who faces the task of processing SGML or XML documents is
in a position to write a parser library for SGML or XML. I think
task-specific ad-hoc parsers are mostly written by people who have not
just finished creating a full parser. I don't know how many of them
are in fact written now. As you have observed, anyone in a position
to use an off-the-shelf parser library will -- assuming the API is
simple enough -- often find it more convenient to use an off-the-shelf
parser.

-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com