Re: XML Redux
I like Dave Pawson's use of <> as formal markup delimiters, but I'd still point to the XQuery XDM and question whether, with a few syntactic shortcuts, you couldn't get something that still satisfies the XDM while at the same time giving you a JSON-esque notation. Consider the following:

    ("This is a test",<foo>This is <bar>an element</bar> inside an element</foo>,12,25,<bin bat="term">More text</bin>)

Rewrite this in XQuery constructor notation:

    ("This is a test", element foo {('This is ',element bar {'an element'},' inside an element')},12,25,element bin {(attribute bat {"term"},"More text")})

Replace element foo with *foo: () and attribute bat with @bat: ():

    ("This is a test",*foo: ("This is ",*bar: ('an element'),' inside an element'),12,25,*bin: (@bat: "term","More text"))
You could even go a step further by assuming that a construct such as *foo: () automatically "escapes out" of text. Additionally, sequence items that need to be separated could be placed in a [] structure:

    (This is a test *foo: (This is *bar: (an element) inside an element),[12,25],*bin: (@bat: (term) More text))

HTML would be encoded as

    *html: (*head: (*title: (This is the top title) *link: (@rel: (stylesheet) @href: (my.css))) *body: (*h1: (This is the page title) *p: (This is a *b: (test).)))

Finally, it may be possible to eliminate the * notation altogether:

    html: (head: (title: (This is the top title) link: (@rel: (stylesheet) @href: (my.css))) body: (h1: (This is the page title) p: (This is a b: (test).)))
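As a rough illustration of the mapping from angle bracket XML to the name: (...) form, here is a minimal Python sketch. The function name to_paren is made up for this example, and the whitespace handling (trimming each text run and joining items with single spaces) is a simplifying assumption, not part of the proposal.

```python
import xml.etree.ElementTree as ET

def to_paren(elem):
    """Serialize an ElementTree element in the name: (...) notation."""
    parts = []
    # Attributes become @name: (value) items ahead of the content.
    for name, value in elem.attrib.items():
        parts.append(f"@{name}: ({value})")
    # Leading text, then each child followed by the text that trails it.
    if elem.text and elem.text.strip():
        parts.append(elem.text.strip())
    for child in elem:
        parts.append(to_paren(child))
        if child.tail and child.tail.strip():
            parts.append(child.tail.strip())
    return f"{elem.tag}: ({' '.join(parts)})"

doc = ET.fromstring(
    '<html><head><title>This is the top title</title>'
    '<link rel="stylesheet" href="my.css"/></head>'
    '<body><h1>This is the page title</h1>'
    '<p>This is a <b>test</b>.</p></body></html>'
)
print(to_paren(doc))
```

Because of the whitespace simplification, the trailing period in the paragraph comes out as a separate item ("b: (test) ." rather than "b: (test).").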
This doesn't break XML, beyond the document vs. grove issue (which has always been one of the more questionable characteristics of the XML spec); it is compact, more or less readable, and can be readily mapped to JSON. For instance, a list of strings could be differentiated with []:

    colors: (["red","green","blue","yellow"])

JSON would interpret this as:

    {colors: ["red","green","blue","yellow"]}

while XML would interpret it as

    <colors xsi:type="xs:NMTOKENS">red green blue yellow</colors>

or, worst case:

    <colors>
      <xml:null>red</xml:null>
      <xml:null>green</xml:null>
      <xml:null>blue</xml:null>
      <xml:null>yellow</xml:null>
    </colors>

(The case of a list of strings is one where the XDM is superior to the serialization model, since the angle bracket serialization has no notion of a list.)
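The three renderings of the colors list can be sketched in Python as follows. The function names are hypothetical, and the wrapper element name "item" is a placeholder assumption standing in for whatever per-item element a real mapping would choose.

```python
import json
import xml.etree.ElementTree as ET

def list_to_json(name, items):
    """JSON keeps the list as a native array."""
    return json.dumps({name: items})

def list_to_tokens_xml(name, items):
    """Space-separated token convention; breaks if an item contains whitespace."""
    if any(not item or len(item.split()) != 1 for item in items):
        raise ValueError("token lists cannot hold empty or whitespace-bearing items")
    elem = ET.Element(name)
    elem.text = " ".join(items)
    return ET.tostring(elem, encoding="unicode")

def list_to_wrapped_xml(name, items, item_name="item"):
    """Worst case: one wrapper element per list item ('item' is a placeholder)."""
    elem = ET.Element(name)
    for item in items:
        ET.SubElement(elem, item_name).text = item
    return ET.tostring(elem, encoding="unicode")

colors = ["red", "green", "blue", "yellow"]
print(list_to_json("colors", colors))
print(list_to_tokens_xml("colors", colors))
print(list_to_wrapped_xml("colors", colors))
```

The token form is the only one that can fail on valid input (a string such as "light blue"), which is one way of seeing why the list has to live in the data model rather than in a serialization convention.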
This is a declarative description, not a functional one, but that doesn't mean that you couldn't take advantage of XQuery-like constructs:

    let $title1 := "This is the top title"
    let $title2 := "This is the page title"
    let $page := html: (head: (title: ({$title1}) link: (@rel: (stylesheet) @href: (my.css))) body: (h1: ({$title2}) p: (This is a b: (test).)))
    return $page

and as white space isn't that much of a concern:

    let $page := html: (
        head: (
            title: ({$title1})
            link: (
                @rel: (stylesheet)
                @href: (my.css)
            )
        )
        body: (
            h1: ({$title2})
            p: (This is a b: (test).)
        )
    )

Note that this is the primary reason why I haven't used the curly brace for this particular notation; it's become too thoroughly established as an escape mechanism for the underlying scripting environment.
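To show why keeping the curly brace free is convenient, here is a small Python sketch of the {$name} escape treated as plain variable substitution. This is an assumption about how such an escape might behave, not XQuery's actual evaluation model; the expand function and its bindings dictionary are invented for the example.

```python
import re

# A {$name} escape: curly braces are reserved for the host language,
# so they never collide with the name: (...) structure itself.
VAR = re.compile(r"\{\$([A-Za-z_][\w-]*)\}")

def expand(template, bindings):
    """Replace each {$name} in template with bindings[name]."""
    def substitute(match):
        name = match.group(1)
        if name not in bindings:
            raise KeyError(f"unbound variable ${name}")
        return bindings[name]
    return VAR.sub(substitute, template)

page = expand(
    "html: (head: (title: ({$title1})) body: (h1: ({$title2})))",
    {"title1": "This is the top title", "title2": "This is the page title"},
)
print(page)
```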
Finally, taking Michael Kay's example:

    {
      authors: [
        {name: "Michael Kay", affiliation: "Saxonica"},
        {name: "Liam Quin", affiliation: "W3C"}
      ]
      abstract:<para { style : "bold" }>Here be some dragons</para>
      content:<section { numbers : [1,1,2] }><para>...</para></section>
    }

remap that in the above notation:

    (authors: (
        [null: (name: (Michael Kay) affiliation: (Saxonica)),
         null: (name: (Liam Quin) affiliation: (W3C))])
     abstract: (para: (@style: (bold) Here be dragons))
     content: (section: (numbers: ([1,1,2]) para: (...)))
    )

or, if you use the notation : ( by itself to indicate an "anonymous" class:

    (authors: (
        [: (name: (Michael Kay) affiliation: (Saxonica)),
         : (name: (Liam Quin) affiliation: (W3C))])
     abstract: (para: (@style: (bold) Here be dragons))
     content: (section: (numbers: ([1,1,2]) para: (...)))
    )

Seems pretty straightforward to me, should be fairly easily parseable, and has the advantage of being trivial to wrap within a string. Additionally, "foo : (bar)" is not exactly a common construct lexically, even without whitespace, and escaping it could simply involve the use of a construct such as `foo: (bar)`, with the "`" character indicating that the string should be interpreted literally.
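To give a feel for the parseability claim, here is a minimal recursive-descent sketch in Python for a simplified subset of the notation. It is an illustration under stated assumptions, not a reference grammar: names are plain identifiers optionally prefixed with @, text may not contain a literal ")", and the [] list syntax, the anonymous ": (" form, and backtick escaping are all left out (bracketed lists simply pass through as text). The names OPEN, parse_items, and parse are invented for this example.

```python
import re

# A construct opens with name: ( or @name: ( ; everything else is text.
OPEN = re.compile(r"(@?[A-Za-z_][\w.-]*)\s*:\s*\(")

def parse_items(src, pos=0):
    """Parse items until an unmatched ')' or end of input; return (items, pos)."""
    items = []
    while pos < len(src):
        if src[pos] == ")":
            return items, pos
        opened = OPEN.match(src, pos)
        if opened:
            children, pos = parse_items(src, opened.end())
            if pos >= len(src) or src[pos] != ")":
                raise ValueError(f"missing ')' for {opened.group(1)}")
            items.append((opened.group(1), children))
            pos += 1  # step over the closing parenthesis
            continue
        # Text run: ends at the next construct or the next ')'.
        nxt = OPEN.search(src, pos)
        close = src.find(")", pos)
        ends = [e for e in (nxt.start() if nxt else -1, close) if e != -1]
        end = min(ends) if ends else len(src)
        text = src[pos:end].strip()
        if text:
            items.append(text)
        pos = end
    return items, pos

def parse(src):
    """Parse a whole string; each construct becomes a (name, children) tuple."""
    items, pos = parse_items(src, 0)
    if pos != len(src):
        raise ValueError(f"unmatched ')' at offset {pos}")
    return items

print(parse("p: (This is a b: (test).)"))
```

Mixed content falls out naturally: a text run simply ends wherever the next name: ( pattern begins, which is the "escapes out of text" behavior described earlier.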
The exact nature of the notation can be argued, but I think the important point to consider is that while the serialization model of XML is not fully congruent with JSON, XDM is. Which means that any discussion about a MicroXML needs to be looking at XDM, rather than the XML 1.0 serialization model, as the basis for that simplification.
This is something that I think has been missing in all of the discussions thus far. This is not a notational issue; it's a data modeling one. There are constructs that cannot be modeled readily in JSON but are easily rendered in XML angle bracket notation (ABN), and vice versa: ABN has no mechanism for defining arrays that doesn't rely upon a convention (such as whitespace-separated NMTOKENS), while JSON notation for handling semi-repeating XML structures (such as <a>1</a><a>2</a><b>3</b><a>4</a>) can get hideously complex fast. Yet an XDM notation could represent both cases trivially.
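The semi-repeating case can be made concrete with a short Python sketch. The wrapper element r is an assumption added only so the fragment parses as a document, and the two function names are hypothetical. Grouping children by name gives the compact JSON shape most converters aim for but loses the interleaving of a and b; keeping the order forces a list of name/value pairs.

```python
import json
import xml.etree.ElementTree as ET

# <r> is a made-up wrapper so the fragment has a single root element.
fragment = ET.fromstring("<r><a>1</a><a>2</a><b>3</b><a>4</a></r>")

def group_by_name(parent):
    """Compact JSON shape: one key per element name. Document order is lost."""
    grouped = {}
    for child in parent:
        grouped.setdefault(child.tag, []).append(child.text)
    return grouped

def ordered_pairs(parent):
    """Order-preserving shape: a list of [name, value] pairs."""
    return [[child.tag, child.text] for child in parent]

print(json.dumps(group_by_name(fragment)))
print(json.dumps(ordered_pairs(fragment)))
```

From the grouped form there is no way to tell that b came between the second and third a, whereas a sequence-based model keeps that information without any extra convention.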
Kurt Cagle
Invited Expert, Forms Working Group, W3C
443-837-8725

On Tue, Feb 15, 2011 at 10:49 AM, Michael Kay <mike@saxonica.com> wrote: