RE: Documents, data and markup: YAML Ain't Markup Language

To: 'Paul Prescod' <paul@p...>, Dare Obasanjo <dareo@m...>, xml-dev@l...
Subject: RE: Documents, data and markup: YAML Ain't Markup Language
From: Micah Dubinko <MDubinko@c...>
Date: Fri, 6 Jun 2003 13:25:00 -0700

So, how about a YAML syntax for RDF? Might even solve the tricky problem of
how to embed RDF in (DTD-based) XHTML...

.micah

-----Original Message-----
From: Paul Prescod [mailto:paul@p...]
Sent: Friday, June 06, 2003 1:19 PM
To: Dare Obasanjo; xml-dev@l...
Subject: Documents, data and markup: YAML Ain't Markup Language

As Eric said, mixed content is a big one.

In document applications, order tends to matter by default.

In data applications, order tends not to matter except in specialized 
list contexts.

In data applications, name/value pairs are probably the most convenient 
"fundamental data type". In documents, lists of elements tend to be. It 
is only because documents tend not to make heavy use of name/value pairs 
that XML can get away with such a weak notion of attributes (which, 
ironically, data-heads are often agitating to remove!).

Because of the name/value orientation of data applications, it is 
usually safe to ignore an unknown element as an "extension". But in a 
document application unknown elements tend to have semantics that you 
really should deal with. A publisher can't say "I've never heard of a 
colophon, therefore I'll just throw it out."
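
To make the contrast concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. The vocabularies (KNOWN_FIELDS, KNOWN_BLOCKS) and the sample documents are hypothetical, chosen only for illustration: the data-style loader silently drops a field it does not recognize, while the document-style check refuses to continue.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabularies, assumed for this illustration only.
KNOWN_FIELDS = {"name", "email"}          # fields a data consumer understands
KNOWN_BLOCKS = {"book", "title", "para"}  # elements a publisher can render

CONTACT_XML = "<contact><name>Ada</name><pager>555</pager></contact>"
BOOK_XML = "<book><title>T</title><colophon>Set in Garamond</colophon></book>"

def load_contact(xml_text):
    """Data view: keep the fields we know, ignore the rest as extensions."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root if child.tag in KNOWN_FIELDS}

def check_document(xml_text):
    """Document view: every element carries meaning, so unknown ones are errors."""
    root = ET.fromstring(xml_text)
    unknown = [e.tag for e in root.iter() if e.tag not in KNOWN_BLOCKS]
    if unknown:
        raise ValueError(f"unhandled elements: {unknown}")

contact = load_contact(CONTACT_XML)  # the unknown <pager> field is dropped

try:
    check_document(BOOK_XML)
except ValueError as err:
    print(err)  # the publisher has to deal with <colophon>
```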

Data-oriented applications tend to want to map XML elements to objects 
(thus the emphasis on name/value pairs). Document-oriented applications 
tend to use a stream processing or visitor model.
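
As a rough sketch of the two processing styles (again with Python's standard xml.etree.ElementTree, and with sample inputs invented for illustration): the data-oriented function turns child elements into a dictionary and discards order, while the document-oriented function walks the tree top-down and preserves mixed content in document order.

```python
import xml.etree.ElementTree as ET

# Sample inputs are assumptions made up for this illustration.
DATA_XML = "<person><name>Ada</name><city>London</city></person>"
DOC_XML = "<p>Order <em>matters</em> in <b>mixed</b> content.</p>"

def to_record(xml_text):
    """Data view: child elements become name/value pairs; order is discarded."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}

def visit(elem, out):
    """Document view: recursive top-down walk that keeps text and tags in order."""
    out.append(f"<{elem.tag}>")
    if elem.text:
        out.append(elem.text)
    for child in elem:
        visit(child, out)
        if child.tail:
            # Text following a child element belongs to the parent's mixed content.
            out.append(child.tail)
    out.append(f"</{elem.tag}>")
    return out

record = to_record(DATA_XML)
events = visit(ET.fromstring(DOC_XML), [])
```

The dictionary has no place to put the text that sits between elements, which is one way of seeing why mixed content fits the object-mapping model so poorly.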

Data-oriented systems tend to distinguish between roles 
(fields/properties/attributes) and types. Documents tend to mix them all 
together (is "title" a role or a type of thing?).

Data-oriented systems tend to prefer object types to be detectable 
independent of context (thus namespaces) whereas document processing is 
typically done top-down recursively so relying on context is natural.

I am good friends with one of the inventors of YAML and I don't argue 
with him when he says that YAML is better for most data-oriented 
applications. I think he's probably right. But as somebody else said, 
what would be the cost in toolset complexity of having to master two 
different languages?

If one could go back in time, one could approach the problem from 
scratch with the needs of document and data heads equally represented. 
It would not just be useful to combine them so we could reuse tools. It 
would be useful to combine them because most documents have a 
data-oriented subset (if only the "metadata" element at the top) and 
many data applications have a document-oriented subset (if only rich 
text fields). Another reason to combine them is that there is no clear 
boundary. There is a spectrum.

But I'm sorry to say that that is not the way XML is.

And by the way, if you consider RDF:

  * triples are roughly equivalent to name/value pairs (the third item 
in the triple is the "parent" object)
  * order does not matter by default
  * types and roles are distinguished
  * types and roles are context-free
  * triples with unknown predicates are easily ignored
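
A minimal sketch of these properties, using plain Python tuples rather than any RDF library. The subjects, predicates, and values are invented for illustration, and the sketch simplifies by assuming one value per predicate. It reads each triple as (subject, predicate, object), with the subject acting as the "parent" and the predicate/object pair acting as the name/value pair.

```python
from collections import defaultdict

# Hypothetical triples, assumed for illustration: (subject, predicate, object).
triples = [
    ("doc1", "title", "Documents, data and markup"),
    ("doc1", "format", "text/plain"),
    ("doc1", "x-rating", "5"),      # predicate this consumer does not know
    ("doc2", "title", "Doc vs. Data"),
]

KNOWN_PREDICATES = {"title", "format"}

def group_by_subject(triples, known):
    """Collect name/value pairs per subject, skipping unknown predicates."""
    objects = defaultdict(dict)
    for subject, predicate, obj in triples:
        if predicate in known:
            objects[subject][predicate] = obj
    return dict(objects)

forward = group_by_subject(triples, KNOWN_PREDICATES)
backward = group_by_subject(list(reversed(triples)), KNOWN_PREDICATES)
# forward == backward: the order of the triples does not affect the result.
```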

IMHO, it is precisely the impedance mismatch between the data view of the 
world and XML that makes RDF look so ugly. As a data model, RDF is not 
far from ideal for most of the data-oriented applications I've done.

I think that having a clean strategy for merging the two worlds is one 
of the big open questions in the XML world.

  Paul Prescod

-----------------------------------------------------------------
The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>

The list archives are at http://lists.xml.org/archives/xml-dev/

To subscribe or unsubscribe from this list use the subscription
manager: <http://lists.xml.org/ob/adm.pl>
