XML-DEV Mailing List Archive
Documents, data and markup: YAML Ain't Markup Language
As Eric said, mixed content is a big one.

In document applications, order tends to matter by default. In data applications, order tends not to matter except in specialized list contexts.

Name/value pairs are probably the most convenient "fundamental data type" for data applications. In documents, lists of elements tend to be. It is only because documents tend not to make heavy use of name/value pairs that XML can get away with such a weak notion of attributes (which, ironically, data-heads are often agitating to remove!).

Because of the name/value orientation of data applications, it is usually safe to ignore an unknown element as an "extension". But in a document application, unknown elements tend to have semantics that you really should deal with. A publisher can't say "I've never heard of a colophon, therefore I'll just throw it out."

Data-oriented applications tend to want to map XML elements to objects (thus the emphasis on name/value pairs). Document-oriented applications tend to use a stream processing or visitor model.

Data-oriented systems tend to distinguish between roles (fields/properties/attributes) and types. Documents tend to mix them all together (is "title" a role or a type of thing?).

Data-oriented systems tend to prefer object types to be detectable independent of context (thus namespaces), whereas document processing is typically done top-down and recursively, so relying on context is natural.

I am good friends with one of the inventors of YAML, and I don't argue with him when he says that YAML is better for most data-oriented applications. I think he's probably right. But as somebody else said, what would be the cost in toolset complexity of having to master two different languages?

If one could go back in time, one could approach the problem from scratch with the needs of document and data heads equally represented. It would not just be useful to combine them so we could reuse tools.
It would be useful to combine them because most documents have a data-oriented subset (if only the "metadata" element at the top) and many data applications have a document-oriented subset (if only rich text fields). Another reason to combine them is that there is no clear boundary. There is a spectrum. But I'm sorry to say that that is not the way XML is.

And by the way, if you consider RDF:

* triples are roughly equivalent to name/value pairs (the third item in the triple is the "parent" object)
* order does not matter by default
* types and roles are distinguished
* types and roles are context-free
* triples with unknown predicates are easily ignored

IMHO, it is precisely the impedance mismatch between the data view of the world and XML that makes RDF look so ugly. As a data model, RDF is not far from ideal for most of the data-oriented applications I've done.

I think that having a clean strategy for merging the two worlds is one of the big open questions in the XML world.

Paul Prescod