
Serialization of XDM

Pavel Minaev int19h at gmail.com
Mon Sep 14 14:29:30 PDT 2009


On Mon, Sep 14, 2009 at 1:16 PM, Michael Kay wrote:
>> On the other hand, it seems to me that we already have such a
>> serialization format, and it's XQuery itself - or rather the
>> subset that involves literals, sequence "constructor" (i.e.
>> comma operator), and direct element/attribute/namespace/text
>> constructors. It's as portable as it gets - any XQuery
>> processor can immediately parse it.
>> And it doesn't seem to be missing anything from XDM, either.
>
> For some use cases it's important that the format be canonical: that is,
> there should only be one way of representing a given sequence.
>
> Also, XQuery doesn't have any direct way of representing the type annotation
> on a node: the closest you can get is a validate{} instruction, and the
> effect of this is very context-dependent.

I wonder if the "treat as" operator could be abused for this purpose. The
deserializer would naturally have to be specified as if it always ran the
input sequence through validate{} recursively...

Of course, there's still the problem that XQuery has no inline schemas,
so it's still not a fully self-contained format. Then again, in
practice few people use inline schemas for plain XML serialization
either, so it might not be much of a problem.

> Incidentally, one deficiency of the Saxon -wrap format that I mentioned
> earlier is that it doesn't contain any information about node identity. If
> you have a sequence of three elements with the same name and content, it
> won't tell you whether the sequence being serialized contains three distinct
> nodes, or a single node repeated three times.

Good point. It's fairly obvious how node identity could be expressed in
plain XQuery with let/return, but I can't think of a good way to define
a canonical serialization format for it. One approach would be to
require every node that occurs more than once in the output (and only
such nodes) to be bound by its own let-clause at the beginning of the
query, with the order of the clauses and the variable names left
unspecified, immediately followed by a single "return" clause that
references them. But this wouldn't be very human-readable for
moderately sized graphs, IMO.
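
To make the let/return idea concrete, here is a rough sketch of a
serializer that emits that shape. It is written in Python rather than
XQuery, with ElementTree elements standing in for XDM nodes and Python
object identity standing in for node identity. The function names
(to_constructor, serialize_sequence) and the $n1, $n2 naming in order of
first occurrence are assumptions of mine; the approach above leaves both
names and order unspecified. It only tracks identity of top-level items,
not of nodes nested inside them, and it ignores namespaces.

```python
import xml.etree.ElementTree as ET


def to_constructor(node):
    """Render an element as XQuery direct element constructor text."""
    text = ET.tostring(node, encoding="unicode")
    # Curly braces delimit enclosed expressions in direct constructors,
    # so literal braces have to be doubled.
    return text.replace("{", "{{").replace("}", "}}")


def serialize_sequence(nodes):
    """Emit a let/return query that preserves identity of repeated nodes."""
    # Count how many times each distinct node object occurs.
    counts = {}
    for node in nodes:
        counts[id(node)] = counts.get(id(node), 0) + 1

    names = {}  # id(node) -> variable name, for repeated nodes only
    lets = []
    items = []
    for node in nodes:
        key = id(node)
        if counts[key] > 1:
            if key not in names:
                # Hypothetical naming scheme: $n1, $n2, ... by first occurrence.
                names[key] = "$n%d" % (len(names) + 1)
                lets.append("let %s := %s" % (names[key], to_constructor(node)))
            items.append(names[key])
        else:
            # Nodes that occur once are inlined, never let-bound.
            items.append(to_constructor(node))

    body = "(" + ", ".join(items) + ")"
    if not lets:
        return body  # no repeated nodes: a plain sequence expression suffices
    return "\n".join(lets) + "\nreturn " + body


a = ET.fromstring("<a>x</a>")
b = ET.fromstring("<a>x</a>")  # same name and content, but a distinct node
query = serialize_sequence([a, a, b])
print(query)
```

For the sequence (a, a, b) this prints a single let-clause binding $n1,
followed by "return ($n1, $n1, <a>x</a>)", so the reader can tell that the
first two items are the same node and the third is a distinct one.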

