XML-DEV Mailing List Archive

RE: Semantic equivalence of xml documents

At 13:46 +0200 2005-09-26, Yves Langisch wrote:
>Michael,
>
>I mean for example that
>
><company>
>	<person>Jim</person>
>	<person>John</person>
></company>
>
>is equivalent to
>
><company>
>	<person>John</person>
>	<person>Jim</person>
></company>
>
>Yves

Yves,

Whether these two are "equivalent" depends on your situation. They 
are of course not precisely "equal", but for your particular schema 
and your particular purposes they count as equal. That's fine, but 
perhaps in someone else's schema the first <person> in each <company> 
is the president, so the two cases would have very different semantics.

Once you define *exactly* what you mean by equivalence, you will be 
able to find tools to help check it. That definition has to be very 
specific, and you may need a rule for each element. It is probably 
true that only some of your elements can be re-ordered this way, not 
all of them, and you will have to tell the computer exactly which ones.

If the only unusual case is reordering, then one way to do your tests 
would be to write XSLT to sort the elements in question (like 
<person> here). Then, after the sorting, you could just do a regular 
comparison.
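
The sort-then-compare idea can be sketched outside XSLT as well. The 
following is a minimal Python sketch using only the standard library, 
not a finished tool. The function names and the `unordered` set (the 
tags whose children may be reordered) are assumptions made for this 
illustration, and it assumes Python 3.8 or later for ET.canonicalize.

```python
import xml.etree.ElementTree as ET


def _sort_children(elem, unordered):
    """Recursively sort the children of elements whose tag is in `unordered`."""
    for child in elem:
        _sort_children(child, unordered)
    if elem.tag in unordered:
        # Children are already normalized, so their serialization is a stable key.
        elem[:] = sorted(elem, key=lambda c: ET.tostring(c, encoding="unicode"))


def normalize(xml_text, unordered):
    """Return a canonical string in which order-insensitive children are sorted."""
    # strip_text=True drops indentation; it also trims real text, which is
    # an assumption that would be wrong for mixed content.
    root = ET.fromstring(ET.canonicalize(xml_text, strip_text=True))
    _sort_children(root, unordered)
    return ET.canonicalize(ET.tostring(root, encoding="unicode"))


def equivalent(a, b, unordered):
    """Compare two documents after sorting the reorderable children."""
    return normalize(a, unordered) == normalize(b, unordered)


doc_a = "<company>\n\t<person>Jim</person>\n\t<person>John</person>\n</company>"
doc_b = "<company>\n\t<person>John</person>\n\t<person>Jim</person>\n</company>"

print(equivalent(doc_a, doc_b, {"company"}))  # children of <company> may be reordered
print(equivalent(doc_a, doc_b, set()))        # order is significant everywhere
```

Passing the set of reorderable parents explicitly is the point: it is 
where you "tell the computer exactly which ones". In the president 
example, <company> would simply be left out of the set.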

Depending on your situation, you may have harder problems. For example, is

    <p style='first-indent:1in'>

equivalent to

    <p style='first-indent:72pt;'>

I don't know of any tools that will help with that sort of thing. As 
Michael pointed out, if you also want to really check the "meaning" 
of the sentences within the document, that is a very hard problem.
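
Absent a ready-made tool, a case like the style attribute above needs 
a hand-written rule. The sketch below is only an illustration of what 
such a rule might look like in Python: normalize_style and the unit 
table are hypothetical names, only a few absolute units are handled 
(1in = 72pt, 1pc = 12pt), and anything it does not recognize is left 
as a plain string.

```python
import re

# Absolute length units expressed in points (assumed scope of this sketch).
PT_PER_UNIT = {"pt": 1.0, "in": 72.0, "pc": 12.0}

_LENGTH = re.compile(r"([0-9]*\.?[0-9]+)(pt|in|pc)")


def normalize_style(style):
    """Parse a style attribute into a dict, converting known lengths to points."""
    result = {}
    for declaration in style.split(";"):
        if not declaration.strip():
            continue  # tolerate a trailing semicolon
        name, _, value = declaration.partition(":")
        value = value.strip()
        match = _LENGTH.fullmatch(value)
        if match:
            value = float(match.group(1)) * PT_PER_UNIT[match.group(2)]
        result[name.strip()] = value
    return result


print(normalize_style("first-indent:1in") == normalize_style("first-indent:72pt;"))
```

Comparing dicts rather than strings also makes the order of the 
declarations and the trailing semicolon irrelevant, which is itself a 
decision about what "equivalent" means for this attribute.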

Usually, when we talk about two XML documents being "equivalent," we 
mean syntax, not semantics. This is because the semantics of XML are 
not defined. So there are well-known methods to decide about syntax 
cases like

    <p type='foo'>

versus

    <p            type         =        "foo"          >

But for truly semantic issues like what you seem to need, the problem 
is much harder.
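
For the purely syntactic cases, one such well-known method is 
Canonical XML (C14N): serialize both documents in canonical form and 
compare the results. A minimal Python sketch follows, assuming Python 
3.8 or later, where the standard library provides ET.canonicalize. 
The function name syntactically_equal is made up for this example, 
and the <p> elements are given content and end tags so that they 
parse as complete documents.

```python
import xml.etree.ElementTree as ET


def syntactically_equal(a, b):
    """True if two XML documents have the same canonical (C14N) serialization."""
    return ET.canonicalize(a) == ET.canonicalize(b)


compact = "<p type='foo'>text</p>"
spaced = '<p            type         =        "foo"          >text</p>'

print(syntactically_equal(compact, spaced))
print(ET.canonicalize(spaced))
```

Canonicalization removes differences in quoting, whitespace inside 
tags, and attribute order, but it deliberately leaves element order 
and text content alone, so it does not address the reordering case.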

Once you define exactly what you mean by equivalence and get it 
written down very precisely, you may be able to check a lot of it 
with XSLT followed by syntax comparisons, or other methods. But until 
one knows precisely what is required, it's hard to advise about tools.

If you want assistance in figuring out just what equivalence means 
for your project, that may be a large question, and may be an 
appropriate area to seek help from any of the excellent consultants 
available on the list.

Steve DeRose


-- 
Luthien Consulting: Real solutions to hard information management problems
    Specializing in information design, XML, schemas, XSLT, and 
project design/review/repair
Steven J. DeRose, Ph.D., sderose@a...
