XML and ad-hoc syntax (was: Re: Please stop writingspecificati
I second most of what Norm said in his mail, and want to comment further on one point.

Norm Tovey-Walsh <ndw@nwalsh.com> writes:

> Dimitre Novatchev <dnovatchev@gmail.com> writes:
>> ...
>> I envy GitHub authors who only have to use MD, and can easily produce
>> stunning documents.
>
> ...
>
> If you’re willing to invent arbitrary amounts of ad hoc syntax, and edit
> that syntax in a text editor with no understanding of the syntax (or
> write a customized editor, I suppose), it’s probably possible to design
> a Markdown-style syntax that would capture the structure of, for
> example, the QT specifications, but *BOY* it would not be pretty. (If
> you think I’m mistaken, I invite you to propose a MD style grammar that
> will capture the information necessary to generate them. You get zero
> credit for 80% of the job. The first 80% is easy. It’s a zero-sum
> challenge, succeed or fail, there is no try.)

In connection with the choice between inventing an ad hoc syntax and just using XML, I offer two data points. (Or three, I guess.) For what they are worth.

- During the development of XSLT 2.0 and 3.0 and XQuery 1.0 and 3.*, it was noticeable that every time new functionality was added to the design, the course of the discussion depended a lot on whether the new functionality was to be part of XPath, or XQuery, or XSLT.

  If the new functionality was going into XSLT, almost all the time on an issue went to discussing how the functionality should work. Extending the syntax of the language to express the new functionality never took any appreciable amount of time, because adding a new attribute or element never risked introducing ambiguity or lookahead problems. Of course, deciding what attributes and elements to add or change required some care and thought, but it never became the kind of roadblock that it routinely became in the XQuery WG.
  For XPath or XQuery, the discussion of functionality would take the same amount of time as in XSLT, and then additional time would be needed to work out the syntax of the new functionality -- maybe the same amount of time again, maybe as little as half the time spent on functionality. It was not unusual to have to iterate multiple times, because the WG members who maintained the grammar would report back that the current syntax proposal was incurably ambiguous or otherwise problematic. (And every now and then the WG would shoe-horn the problematic syntax in anyway, by adding a new ad hoc rule for the tokenizer. The result is, for connoisseurs of formal grammars, a bit of a mess.)

- A year or two ago, some of those working on invisible XML discussed how to build test suites for ixml. We discussed whether to base our work on some existing test framework with a non-XML syntax (I can't remember the name) or write the test catalog in XML. There was some sentiment for starting from the non-XML syntax -- after all, this is invisible XML, so we can turn it into XML whenever we want, right?

  The first problem came when I tried to write my first test catalog using the non-XML syntax. The existing system had very sparse metadata and had no hooks for any of the kind of information I think is helpful for communally maintained test suites (like: a change history). I tried to figure out a nifty way to extend the existing non-XML syntax to handle the information needed, and after a few hours of banging my head against the wall I gave up and wrote the catalog in XML.

  Figuring out the XML representation of the test catalog then involved mostly thinking about the kind of information needed and its structure, and a little bit of thinking about what to call things and how things should nest. No one has ever said "Oh, but wouldn't it be nicer to have a non-XML syntax for the catalog?"

In both of these cases, non-XML syntaxes proved harder to work with than XML syntaxes.
That's not always the case, surely: many people whose goals for their documents are limited to getting ink on paper or lighting up pixels on screens are happy with Markdown. And for some complicated information structures, I think designing an ixml grammar can be helpful -- indeed, at Balisage this year I will be reporting on one such case. So let's add a third data point:

- When I was transcribing Gottlob Frege's 1879 book on 'concept notation' (Begriffsschrift), I experimented briefly with representing his two-dimensional logical notation using an XML syntax. One might, for example, use something like SVG to talk about the two-dimensional shapes of the formulas. Or one might use an XML syntax for logic to represent the logical structure expressed by the formulas. But I found that transcribing even a relatively simple formula involved an awful lot of machinery. And once I actually understood Frege's notation reasonably well, I could see that a simple, easily keyboardable syntax could be devised which would make it easier to capture his visually and logically complex structures.

  So I devised such a keyboardable syntax and wrote an ixml grammar for it, and the transcription proceeded without incident. It would have been better if there had been an easy way to get syntax support in the editor, to detect syntax problems in the transcribed formulas. Detecting them by parsing the formulas to XML and then converting them to SVG took longer than detecting an XML validity error in a schema-aware editor. But on the whole I think the use of ixml was a big success here.

I am not completely sure what factors make XML syntax better in one case and non-XML syntax in another.

My guess is that one factor is that Frege's notation really is quite specialized. So there are a lot of things that may happen in the universe of technical writing, logic, or mathematics which my ixml grammar does not need to handle.
All of that is handled by the surrounding XML: the non-XML syntax is used only within 'formula' elements. (Of course, reality sometimes bites back: in reality, there are several cases in the book which require text-critical markup to record variant readings in different editions of Frege's book. For that, there is no easy non-XML markup.)

Another factor is the stability of the target: extending an XML vocabulary is easy, and extending a non-XML syntax is hard, partly because we have very few good tools for managing changes in a grammar. So it was the addition of new functionality, or new metadata requirements, that made non-XML syntaxes so much work in the first two cases, and the stability of the target that made ixml work well in the third case: Frege's notation in the 1879 book is not going to change any more, so the case of having to add new functionality is just not going to arise.

And, of course, the ixml case works well in part (at least for me) because what the parser produces is XML, which I can process using a fairly capable technology stack.

I'm curious about other people's experiences. As always, YMMV.

--
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
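
The claim that extending an XML vocabulary is easy can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the element and attribute names (`test-catalog`, `test-case`, `grammar-ref`, `input-ref`, `modified`) are hypothetical and are not the vocabulary of the actual ixml test catalogs, and the sample values are made up. The sketch shows only that change-history elements can be added to a catalog without disturbing code written against the earlier form.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog vocabulary, invented for this illustration only.
CATALOG_V1 = """
<test-catalog name="sample">
  <test-case name="t1">
    <grammar-ref href="g1.ixml"/>
    <input-ref href="t1.txt"/>
  </test-case>
</test-catalog>
"""

# The same catalog after adding change-history metadata (made-up values).
CATALOG_V2 = """
<test-catalog name="sample">
  <test-case name="t1">
    <modified by="A" on="2022-02-01" change="corrected expected result"/>
    <grammar-ref href="g1.ixml"/>
    <input-ref href="t1.txt"/>
  </test-case>
</test-catalog>
"""


def list_tests(catalog_xml):
    """Return (test name, grammar href, input href) for each test case.

    Written against the first version of the vocabulary; it looks only
    at the elements it knows and ignores any others.
    """
    root = ET.fromstring(catalog_xml)
    return [
        (
            case.get("name"),
            case.find("grammar-ref").get("href"),
            case.find("input-ref").get("href"),
        )
        for case in root.iter("test-case")
    ]


def change_history(catalog_xml):
    """Return {test name: [(date, author), ...]} from 'modified' elements."""
    root = ET.fromstring(catalog_xml)
    return {
        case.get("name"): [(m.get("on"), m.get("by")) for m in case.findall("modified")]
        for case in root.iter("test-case")
    }


if __name__ == "__main__":
    print(list_tests(CATALOG_V1))
    print(list_tests(CATALOG_V2))
    print(change_history(CATALOG_V2))
```

The older consumer, `list_tests`, returns the same result for both versions of the catalog because the new element introduces no ambiguity for the parser. A line-oriented ad hoc format would typically need its grammar and every hand-written reader revisited to achieve the same thing.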