RE: Still not the essence of XML (was Re: S-expressi
S-exprs are equivalent to XML for a large class of XML users.

First of all, if you think about why most people use XML, you'll realize it's mostly because of the network effect and not anything intrinsically fantastic about angle brackets. In working with XML I've seen two broad classes of XML users: those who want to represent structured data [database exportation and unified query, web services, RSS feeds, config files, etc.] and those who actually want to mark up content. Most of the former do not need XML features like PIs, entities or encodings, and in fact tend to shun them.

Secondly, from my experience working with XML, there are a lot more data-centric users of XML than there are markup/document-centric users. From that perspective, most users of XML could be just as well served by S-expressions, which for all intents and purposes would be equivalent to XML for their use cases.

PS: I do think that Wadler's paper is extremely mistitled.

-----Original Message-----
From: Rick Jelliffe [mailto:ricko@a...]
Sent: Sat 1/11/2003 12:40 AM
To: xml-dev@l...
Cc:
Subject: Still not the essence of XML (was Re: S-expressions vs. XML)

From: "Alaric Snell" <alaric@a...>

> One other point: Don't confuse LISP and s-exprs, as a few posts I've just
> seen on this kind of do.
>
> s-exprs are a way of writing information, kind of like XML.
>
> LISP is a language based around an s-expr data model that happens to use
> s-exprs also for its written syntax, kind of like XSLT.

And similarly, don't confuse XML with its infoset, let alone the PSVI. It is the infoset or PSVI or canonical XML that are (paradoxically) closest to S-exprs (if we accept properties as part of S-exprs).

S-exprs as such have no equivalent to the XML encoding PI or entities. Without a convention for unambiguously labelling the encoding of the print form and for allowing a plurality of encodings, S-exprs just perpetuate the character set mess that XML helped us escape from.
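As a rough illustration of the labelling point above, here is a minimal Python sketch. It is not from the original message: the sample `<name>` element, the variable names, and the choice of cp437 as the "wrong guess" are all assumptions made for the example. It shows a standard XML parser using the encoding declaration to decode two differently encoded byte streams to the same text, while an unlabelled S-expression leaves the reader to guess.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample value containing one non-ASCII character (e with acute).
value = "Jos\u00e9"

# The same document serialized in two encodings, each labelled in its own
# XML declaration.
latin1_doc = (b'<?xml version="1.0" encoding="ISO-8859-1"?>'
              + f"<name>{value}</name>".encode("iso-8859-1"))
utf8_doc = (b'<?xml version="1.0" encoding="UTF-8"?>'
            + f"<name>{value}</name>".encode("utf-8"))

# The parser reads the label from the bytes and decodes accordingly.
name_from_latin1 = ET.fromstring(latin1_doc).text
name_from_utf8 = ET.fromstring(utf8_doc).text

# An S-expression written as bare bytes carries no such label.
sexpr_text = f'(name "{value}")'
sexpr_bytes = sexpr_text.encode("iso-8859-1")

# A reader that assumes UTF-8 cannot decode these bytes at all...
try:
    sexpr_bytes.decode("utf-8")
    utf8_guess_failed = False
except UnicodeDecodeError:
    utf8_guess_failed = True

# ...and a reader that assumes another single-byte encoding silently gets
# different text (cp437 is an arbitrary choice for the illustration).
wrong_guess = sexpr_bytes.decode("cp437")
```

Nothing here prevents an S-expression format from adopting an encoding-label convention of its own; the sketch only shows what happens when none exists.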
Another contribution to the XML==S-expr discussions, and one which also blithely ignores any issues of encoding, is the new version of Wadler and Simeon's "The Essence of XML", worthwhile reading at
http://www.research.avayalabs.com/user/wadler/papers/xml-essence/xml-essence.pdf

I made some comments on a previous draft on XML-DEV in "Not the essense of XML"
http://lists.xml.org/archives/xml-dev/200207/msg00836.html

The most impressive thing about this may be the politeness of the authors: they say "XML is touted as an external format for representing data. This is not a hard problem. All we require are two properties...Lisp S-expressions, for example, possess these properties.//XML possesses neither property." Where is the politeness? Rather than saying "The people who use XML for more than it was designed for may be mad, bad or irrationally exuberant" they blame the messages.

Yet in avoiding the issues of encoding and construction of documents from parts, they miss two other properties of an external format: "modularity" and "reliability". Many people thinking too much in terms of the XML Infoset seem to think that the issue of labelling encoding is peripheral to XML, whereas I think it is central. There are no other layers or channels for encoding labels to get passed, practically speaking; XML is basically the only format that deals with this issue. The rigorous labelling of character encoding is the essence of XML, just as much as the angle brackets or the element tree.

I think Simeon and Wadler's basic introductory spin is still wrong:

* The property of self-describing as they seem to use it (which I think is good) seems to depend on there being enough lexical forms for each datatype. But by the time you add dates and derived types, you would need to extend basic S-expr syntax. You would need to know all the (primitive) types you wanted to support at syntax-design-time, which rather goes against the point of XML.
And, at the other end, if you are only interested in the kinds of limited datatypes required for publishing (string and various symbols: token, tokens, ID, IDs, IDREF, IDREFs, enumerations, etc.), the lexical forms of markup and built-in DTDs are enough to make XML self-describing.

* For the property of round-tripping, it strikes me that their argument only holds against XML Schemas and has nothing to do with plain XML, so they are still playing fast and loose to get a good title. Good for journalism, but surprising in an academic paper.

So their title and opening section are still misleading or wrong: not the essence of XML but the essence of XML Schema. I guess by hanging around XQuery people all the time, all the authors ever hear of XML is XML+WXS conflated, but I wish they would spare the rest of us.*

At least their abstract is correct. And the body of the paper? I found it very interesting on a lot of fronts, and well worth a delve.

Cheers
Rick Jelliffe

* Perhaps it shows mindset at work that XQuery is "reforming" XML from a relatively untyped format with strings and tokens, suitable for loosely-coupled systems and usable with any datatyping convention, to a strongly typed format with a fixed number of primitive built-in types, suitable for tightly-coupled systems: I heard a member of the XQuery WG say "without types you can't do anything!"

-----------------------------------------------------------------
The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription
manager: <http://lists.xml.org/ob/adm.pl>