
Re: Serialization of XDM - Use cases / Proposal

  • From: "David A. Lee" <dlee@calldei.com>
  • To: Kurt Cagle <kurt.cagle@gmail.com>
  • Date: Sun, 20 Sep 2009 20:31:13 -0400

Kurt, could you expand on what you think the advantages of a format such as your example might be?

(<?xml version="1.0" encoding="UTF-8"?>,"foo",5^positiveInteger,<bar><bat/></bar>,<!-- foo -->)

I'm not at all opposed to multiple new serialization formats, although I'm inclined to think getting *one* more with any decent adoption is an ambitious goal, let alone two.
Your example with RNG is interesting, but I don't think it's quite a parallel. With RNG the non-XML form is intended to be authored by humans, with a design goal of a simple, human-editable representation. In this case, none of the design goals (or use cases) I've come up with so far involve humans authoring the data.

In your example, what is the design intention for a non-XML format?

In my mind, there is one case where a non-XML format for sequences would be very useful, but I'm not satisfied with how it would actually work in practice.
That is, I believe the XDM data most commonly produced in practice is either plain text or a single XML item (element, document).
In both of those cases it would be really nice if the serialization were simply the 'standard' serialization for those items, without any kind of wrapping at all
(no ( ) and no <xdm:wrapper>, etc.).
That way, if you just happened to produce a single XDM item of type element or text, there'd be no extra baggage.
I think that would be really cool. But the only way I've thought of to achieve that is to use a sequence-delimited format with no start and end markers.

My opinion is that if I'm going to have to parse "(" and "," I'd rather be parsing "<wrapper> ... </wrapper>"; at least I wouldn't have to write a new (even if simple) parser and could simply read it as XML. For example, I would like to provide a 'sample implementation' of the serializer and parser written in pure XQuery as an additional way of describing the format besides prose.
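To illustrate the point about reusing a stock XML parser, here is a minimal sketch in Python rather than XQuery. The `xdm:sequence` and `xdm:item` element names, the `urn:example:xdm` namespace, and the `type`/`value` attributes are all hypothetical placeholders loosely modeled on Kurt's wrapper example below; none of them come from any specification.

```python
import xml.etree.ElementTree as ET

XDM_NS = "urn:example:xdm"  # hypothetical namespace, not from any spec

# A wrapped sequence in an assumed vocabulary (illustrative only).
WRAPPED = """\
<xdm:sequence xmlns:xdm="urn:example:xdm">
  <xdm:item type="xs:string" value="foo"/>
  <xdm:item type="xs:positiveInteger" value="5"/>
  <xdm:item type="element"><bar><bat/></bar></xdm:item>
  <xdm:item type="comment">foo</xdm:item>
</xdm:sequence>"""


def parse_wrapped(text):
    """Return a list of (type, payload) pairs, one per item in the sequence."""
    root = ET.fromstring(text)
    items = []
    for item in root.findall(f"{{{XDM_NS}}}item"):
        kind = item.get("type")
        if "value" in item.attrib:
            # Atomic value: keep the lexical form next to its type annotation.
            payload = item.get("value")
        elif len(item):
            # Node item: re-serialize the first child element as a string.
            child = item[0]
            child.tail = None  # drop any whitespace that follows the child
            payload = ET.tostring(child, encoding="unicode")
        else:
            # Text-like item (here, the content of a comment).
            payload = item.text or ""
        items.append((kind, payload))
    return items


items = parse_wrapped(WRAPPED)
for kind, payload in items:
    print(kind, payload)
```

No tokenizer for "(" and "," is needed here; the off-the-shelf parser does all of the lexical work, and the type annotations survive as ordinary attribute values.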

But perhaps you're thinking of a use case or design goal I have neglected.

David A. Lee
dlee@calldei.com  
http://www.calldei.com
http://www.xmlsh.org
812-482-5224


Kurt Cagle wrote:
I'm not unaware of most of the implications of this format, but I still think it's one that's worth thinking on.

For purposes of discussion, suppose that you arbitrarily split sequence serialization from single-item serialization into non-XML formats, because I believe they are actually qualitatively different problems. Referring only to the sequence serialization side of the problem here, I think the question is whether XML sequence serialization and parsing in fact has to be consumable by an XML parser. As I see it, you either end up specifying some arbitrary set of privileged XML sequence tags:

<?xml version="1.0" encoding="UTF-8"?>
<xml:sequence xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xml:item value="foo" type="xs:string"/>
    <xml:item value="5" type="xs:positiveInteger"/>
    <xml:item type="document"><bar><bat/></bar></xml:item>
    <xml:item type="comment">foo</xml:item>
</xml:sequence>

or you work with a direct serialization as described earlier, possibly with RDF encodings for type:

(<?xml version="1.0" encoding="UTF-8"?>,"foo",5^positiveInteger,<bar><bat/></bar>,<!-- foo -->)

Non-native-xml items, such as binary classes invoked through extensions in XQuery or XSLT, would be a more complex proposition, but otherwise I don't really see where you'd have that much trouble with the notation. It would require a modification of any XDM aware application to handle the latter, but I don't necessarily see that as being that major an issue at this stage.

I could see this approach mirroring the approach that RNG utilizes - providing two equivalent representations, one in XML, the other as a compact notation. The serializer in this case would work the way it always does - you would describe the sequence serialization method and possibly content type, and make a distinction between xsx - XML serialization - and xsc - compact notation serialization.
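As a rough sketch of what "two equivalent representations" could look like, the Python below emits both forms from one in-memory sequence. Everything here is an assumption made for illustration: the tuple layout, the `xdm:` vocabulary and `urn:example:xdm` namespace, the function names `to_xsx` and `to_xsc`, and the choice to escape quotes in the compact form by doubling them (borrowed from XQuery string literals). The `^type` suffix follows the compact example above.

```python
from xml.sax.saxutils import escape, quoteattr

# Each item is (kind, type_name, lexical); this layout is illustrative only.
SEQUENCE = [
    ("atomic", "xs:string", "foo"),
    ("atomic", "xs:positiveInteger", "5"),
    ("element", None, "<bar><bat/></bar>"),
    ("comment", None, " foo "),
]


def to_xsx(seq):
    """XML form: one wrapper element per item (hypothetical xdm: vocabulary)."""
    lines = ['<xdm:sequence xmlns:xdm="urn:example:xdm">']
    for kind, type_name, lexical in seq:
        if kind == "atomic":
            lines.append(
                f"  <xdm:item type={quoteattr(type_name)} value={quoteattr(lexical)}/>"
            )
        elif kind == "element":
            # Assumes the lexical form is already well-formed XML.
            lines.append(f'  <xdm:item type="element">{lexical}</xdm:item>')
        else:
            lines.append(f'  <xdm:item type="comment">{escape(lexical)}</xdm:item>')
    lines.append("</xdm:sequence>")
    return "\n".join(lines)


def to_xsc(seq):
    """Compact form: comma-separated items in parentheses, type after '^'."""
    parts = []
    for kind, type_name, lexical in seq:
        if kind == "atomic" and type_name == "xs:string":
            # Assumed escaping rule: double any embedded quote.
            parts.append('"' + lexical.replace('"', '""') + '"')
        elif kind == "atomic":
            parts.append(f"{lexical}^{type_name.split(':')[-1]}")
        elif kind == "element":
            parts.append(lexical)
        else:
            parts.append(f"<!--{lexical}-->")
    return "(" + ",".join(parts) + ")"


print(to_xsx(SEQUENCE))
print(to_xsc(SEQUENCE))
```

Writing either form is about equally easy; the asymmetry is on the reading side, where the XML form can be handed to an existing parser and the compact form needs a new one.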

Kurt Cagle
Managing Editor
http://xmlToday.org


On Sun, Sep 20, 2009 at 2:29 PM, Michael Kay <mike@saxonica.com> wrote:

 
I'm going to ask what may be an obvious question, but wouldn't it make sense for a serialization of a sequence to correspond on the output to the serialization on the input? That is to say, if you had a structure:

("foo",5,<bar><bat/></bar>,<!-- foo -->)

 
The main disadvantage of such a format is that it uses non-XML markup (parentheses and commas), which makes it difficult to parse using tools that are specialized for handling XML markup, for example XSLT and XQuery.
 
Also, it doesn't solve the problem of retaining type annotations, for example the difference between the integer 5 and the positiveInteger 5.
 
 

Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay

 


