Keeping ISO 8879 Alive (was RE: Markup perspective not code)
8/2/2002 9:48:40 AM, "Bullard, Claude L (Len)" <clbullar@i...> wrote:

>Keep ISO 8879 alive. It is ISO that guarantees that markup
>is the property of the commons.

I'm not sure I agree with Len's characterization of the W3C or the intelligence quotient of those who think that XML should move beyond its SGML roots <grin>, but I do agree with the importance of keeping ISO 8879 alive. In fact, I'm beginning to think that both SGML and XML need to be living specs, and some of us need to simply choose which community we belong in. (Many of us will belong to both, and that's fine too).

There certainly is a "documents vs data" cleavage, but IMHO the more crucial differentiator between the two communities is whether they, deep down inside, think of XML as "markup" or "infoset".

Markup people can be minimalists who want to keep things very simple and monastical, or they can be hard core SGML geeks who know how to exploit the bells and whistles of 8879. But at heart they think of "XML" as *text* to be written and marked up by a human and crunched by a variety of tools that can utilize the markup to produce material that will be consumed by humans. They seem to prefer processing tools such as SAX that let them stay close to the syntax. They certainly use XSLT, but tend to think of it as operating at the syntax level, I suspect, and get irritated when it throws away their CDATA sections. They care deeply about syntactical details and changes (XML 1.1, the global/local attribute namespace thread, etc.) because they HAVE to care. They have certainly benefitted from the spoils of the "XML Revolution" but would have been just as happy if the DOM, InfoSet, XQuery, etc. abstractions away from the syntax had never been invented.

"Infoset" people can also be minimalists, or they may have stopped worrying and learned to love the PSVI, the XQuery type system, etc.
They tend to be agnostic about how some input stream was produced, whether by a person, a program, a serialization of some object or database, etc. At heart they think of "XML" as some text or data that can be readily mapped to one flavor or another of the XML data model (of which there are at least 4 by my count: W3C InfoSet, DOM, XPath, and XQuery... maybe JDOM counts as another, I'm not sure). They tend to prefer processing tools that abstract the structure away from the syntax (e.g. DOM/JDOM, XPath, XQuery). They also use XSLT extensively, but conceive of it as a data-model to data-model transformer and may combine XSLT processing with DOM-ish programming. They are perfectly happy with the reformulation of specs such as SOAP from a syntax definition to an InfoSet definition, seeing the potential for specialized serializations as greatly outweighing the problems of deviating from the One True Syntax, because they really think of syntax as a detail that parsers and serializers worry about.

So, one way forward to avoiding endless, fruitless debates on XML-DEV, IMHO, is to accept the fact that we are not one community anymore. We have a lot in common -- the XML 1.0 syntax as a "canonical form", for example ... and XSLT. As I said, many of us can happily live in either camp, switching from one to another as the job requires. But sometimes we just must agree to disagree, proudly stating "I am a syntax person -- I have to care, so quit telling me I shouldn't", or "I am an infoset person -- I don't give a rat's patootie about details of syntax, so quit trying to make me feel guilty about it".

Another way forward, which I doubt many people will agree with, might be to refactor things along SGML "markup for authors" and XML "infoset for programmers" lines.
Agree on a basic syntax for XML 2.0 that removes most of the stuff that the infoset throws away and that gives the DOM (which basically tries to live in both the syntax and InfoSet worlds) fits, such as DTDs, entities, entity references, and CDATA sections. That's not to say that people should stop using entities and CDATA sections, just to say that they "properly" belong in the SGML world where "syntax sugar" is respected and supported. Or, if it is too confusing and politically unworkable to consign this stuff to the ISO 8879 community, separate out an XML "preprocessor" that understands all the author-friendly syntax sugar coating but isolates it from the hard-core element/attribute/text parsing into an InfoSet.
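
To make the markup/infoset split concrete, here is a small sketch using only the Python standard library (my choice for illustration, not something the thread prescribes). The function names `lexical_events` and `tree_text` and the sample documents are hypothetical. It feeds the same content, written once with a CDATA section and once with escaped characters, through a syntax-level parser (expat) and a tree-level one (ElementTree).

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

# Same character data, two spellings: a CDATA section vs. escaped characters.
with_cdata = "<note><![CDATA[a < b && c]]></note>"
with_escapes = "<note>a &lt; b &amp;&amp; c</note>"


def lexical_events(xml_text):
    """Syntax-level view: record the CDATA section boundaries expat reports."""
    events = []
    parser = expat.ParserCreate()
    parser.StartCdataSectionHandler = lambda: events.append("start-cdata")
    parser.EndCdataSectionHandler = lambda: events.append("end-cdata")
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return events


def tree_text(xml_text):
    """Data-model view: only the resulting character data is visible."""
    return ET.fromstring(xml_text).text


# The "markup" view can tell the two documents apart...
cdata_events = lexical_events(with_cdata)
escape_events = lexical_events(with_escapes)

# ...while the "infoset" view cannot.
same_text = tree_text(with_cdata) == tree_text(with_escapes)

# Re-serializing from the tree drops the CDATA section entirely.
roundtrip = ET.tostring(ET.fromstring(with_cdata), encoding="unicode")
```

A syntax person cares that `cdata_events` and `escape_events` differ; an infoset person cares only that `same_text` holds. The lost CDATA section in `roundtrip` is the kind of thing the "preprocessor" idea would push out of the core parse.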