
Documents, data and markup: YAML Ain't Markup Language


As Eric said, mixed content is a big one.

In document applications, order tends to matter by default.

In data applications, order tends not to matter except in specialized 
list contexts.

In data applications, name/value pairs are probably the most convenient 
"fundamental data type". In documents, lists of elements tend to fill 
that role. It is only because documents tend not to make heavy use of 
name/value pairs that XML can get away with such a weak notion of 
attributes (which, ironically, data-heads are often agitating to remove!).

Because of the name/value orientation of data applications, it is 
usually safe to ignore an unknown element as an "extension". But in a 
document application unknown elements tend to have semantics that you 
really should deal with. A publisher can't say "I've never heard of a 
colophon, therefore I'll just throw it out."
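
To make the contrast concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The field names, the element vocabulary and the sample inputs are all made up for illustration; nothing here comes from a real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabularies, assumed purely for this illustration.
KNOWN_FIELDS = {"name", "email"}             # a data-oriented record
KNOWN_ELEMENTS = {"book", "title", "para"}   # a document-oriented format

def load_record(xml_text):
    """Data view: keep the known name/value pairs, skip unknown extensions."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root if child.tag in KNOWN_FIELDS}

def unknown_elements(xml_text):
    """Document view: unknown elements carry meaning, so report them."""
    root = ET.fromstring(xml_text)
    return [el.tag for el in root.iter() if el.tag not in KNOWN_ELEMENTS]

record = load_record(
    "<person><name>Ann</name><fax>555</fax><email>a@example.org</email></person>"
)
unknown = unknown_elements(
    "<book><title>T</title><para>Text</para><colophon>Set in Garamond</colophon></book>"
)
```

The data loader can drop the unexpected "fax" field without harm, while the document checker has to surface the "colophon" so that somebody decides what to do with it.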

Data-oriented applications tend to want to map XML elements to objects 
(thus the emphasis on name/value pairs). Document-oriented applications 
tend to use a stream processing or visitor model.
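
A rough sketch of the two processing styles, again in Python with xml.etree.ElementTree. The Person class, the "em" element and the asterisk rendering are assumptions chosen for the example, not part of any particular toolkit.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class Person:
    """Hypothetical object that a data application maps an element onto."""
    name: str
    email: str

def to_person(xml_text):
    """Data view: children are name/value pairs, so their order is irrelevant."""
    root = ET.fromstring(xml_text)
    fields = {child.tag: (child.text or "") for child in root}
    return Person(name=fields["name"], email=fields["email"])

def render_text(element):
    """Document view: walk top-down in document order, keeping mixed content."""
    parts = [element.text or ""]
    for child in element:
        inner = render_text(child)
        if child.tag == "em":          # assumed inline-emphasis element
            inner = "*" + inner + "*"
        parts.append(inner)
        parts.append(child.tail or "")  # text that follows the child element
    return "".join(parts)

person = to_person("<person><email>a@example.org</email><name>Ann</name></person>")
rendered = render_text(ET.fromstring("<para>Order <em>really</em> matters here.</para>"))
```

Swapping the two children of "person" changes nothing for the object mapping, whereas the recursive walk depends on both the order of the children and the text interleaved between them.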

Data-oriented systems tend to distinguish between roles 
(fields/properties/attributes) and types. Documents tend to mix them all 
together (is "title" a role or a type of thing?).

Data-oriented systems tend to prefer object types to be detectable 
independent of context (thus namespaces) whereas document processing is 
typically done top-down recursively so relying on context is natural.
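
The difference can be sketched in Python as follows. The sample document is invented; the Dublin Core namespace is used only as a familiar example of a context-free name, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample: "title" appears in two contexts, dc:creator in one namespace.
DOC = """<book xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Book title</title>
  <chapter><title>Chapter title</title></chapter>
  <dc:creator>Ann</dc:creator>
</book>"""

DC_CREATOR = "{http://purl.org/dc/elements/1.1/}creator"

def find_creators(root):
    """Data view: the qualified name identifies the type wherever it occurs."""
    return [el.text for el in root.iter(DC_CREATOR)]

def titles_in_context(element, parent_tag=None, found=None):
    """Document view: recurse top-down and let the parent decide what a title is."""
    found = [] if found is None else found
    if element.tag == "title":
        found.append((parent_tag, element.text))
    for child in element:
        titles_in_context(child, element.tag, found)
    return found

root = ET.fromstring(DOC)
creators = find_creators(root)
titles = titles_in_context(root)
```

The creator lookup never needs to know where in the tree it is, while the two "title" elements can only be told apart by the context the recursion carries down.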

I am good friends with one of the inventors of YAML and I don't argue 
with him when he says that YAML is better for most data-oriented 
applications. I think he's probably right. But as somebody else said, 
what would be the cost in toolset complexity of having to master two 
different languages?

If one could go back in time, one could approach the problem from 
scratch with the needs of document and data heads equally represented. 
It would not just be useful to combine them so we could reuse tools. It 
would be useful to combine them because most documents have a 
data-oriented subset (if only the "metadata" element at the top) and 
many data applications have a document-oriented subset (if only rich 
text fields). Another reason to combine them is that there is no clear 
boundary. There is a spectrum.

But I'm sorry to say that that is not the way XML is.

And by the way, if you consider RDF:

  * triples are roughly equivalent to name/value pairs (the third item 
in the triple is the "parent" object)
  * order does not matter by default
  * types and roles are distinguished
  * types and roles are context-free
  * triples with unknown predicates are easily ignored
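
The points above can be illustrated with plain Python tuples standing in for triples. This is a toy model under stated assumptions, not an RDF library: the subject, predicate and object strings are invented, and each predicate is assumed to occur at most once per subject.

```python
# Each triple is (subject, predicate, object); a set has no inherent order.
triples = {
    ("doc1", "type", "Book"),
    ("doc1", "title", "XML in Practice"),
    ("doc1", "x-shelfmark", "QA76"),   # predicate this application does not know
}

# Hypothetical set of predicates the application understands.
KNOWN_PREDICATES = {"type", "title"}

def properties_of(subject, graph):
    """Collect the known name/value pairs attached to one subject."""
    return {
        predicate: obj
        for subj, predicate, obj in graph
        if subj == subject and predicate in KNOWN_PREDICATES
    }

props = properties_of("doc1", triples)
```

The predicate plays the part of the name, the object the part of the value, and the subject is the "parent" that the pair hangs off. The unknown "x-shelfmark" triple is skipped without any special handling.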

IMHO, it is precisely the impedance mismatch between the data view of the 
world and XML that makes RDF look so ugly. As a data model, RDF is not 
far from ideal for most of the data-oriented applications I've done.

I think that having a clean strategy for merging the two worlds is one 
of the big open questions in the XML world.

  Paul Prescod

