
Re: Testing 2 XML documents for equality - a solution

Subject: Re: Testing 2 XML documents for equality - a solution
From: Mukul Gandhi <mukul_gandhi@xxxxxxxxx>
Date: Mon, 4 Apr 2005 09:19:15 -0700 (PDT)
--- David Carlisle <davidc@xxxxxxxxx> wrote:
> 
> For the vast majority of nodes this is still a) a very
> expensive way of comparing them and b) doesn't help
> with the comparison.

I agree! I understand that generating the string hash
of the entire XML document is an expensive operation.
Thinking about it further, I suspect that two XML
documents can differ and still produce the same
concatenated string representation, so my algorithm
will probably fail in some cases. I have no proof of
this yet: the XML examples I ran through my stylesheet
gave the answers I expected. I'll test more to see
whether it fails in some cases.
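
A small check of that suspicion, sketched in Python rather
than XSLT (the two sample documents and the helper name
string_value are made up for illustration, and this is not
the stylesheet from this thread). It concatenates the text
nodes of each document, which is roughly what the string
value of the root element gives you, and shows that two
structurally different documents can collide:

```python
import xml.etree.ElementTree as ET


def string_value(xml_text: str) -> str:
    """Concatenate all text nodes in document order; markup is discarded."""
    return "".join(ET.fromstring(xml_text).itertext())


# Hypothetical sample documents: same text overall, split differently
# between the <b> and <c> elements.
doc_a = "<a><b>foo</b><c>bar</c></a>"
doc_b = "<a><b>foob</b><c>ar</c></a>"

print(string_value(doc_a))                         # foobar
print(string_value(doc_b))                         # foobar
print(string_value(doc_a) == string_value(doc_b))  # True
```

So a comparison (or hash) based only on the concatenated
text would report these two documents as equal.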

> For a given element node, if you calculate an XPath
> to the current node, and then use that XPath to find
> a node in the other document, you have two nodes. You
> then need to compare whether they are equal, but that
> is _exactly_ the problem you are trying to solve. The
> earlier stylesheet just took the string value of the
> node, but that is just the concatenation of all the
> element content, so it loses most of the markup
> information.

I think you are right! (as always :) )

> What is wrong with the much simpler alternative of
> just writing out the string corresponding to a
> specific "canonical" linearisation, and then just
> comparing those two strings?

I think I should explore this option. But I believe
that converting an XML document to canonical form is
not a trivial task. For example, we need to convert
documents to UTF-8, i.e. if an XML document has the
encoding ISO-8859-1, its canonical representation will
have the encoding UTF-8 (this, I think, cannot easily
be accomplished with XSLT; in fact, I think it may be
impossible with XSLT?). I think there are also other
canonicalization rules which cannot easily be
implemented with XSLT.

I think that with a SAX parser it is probably easier
to convert XML to canonical form (of course, one must
know all the rules as well!).
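
For comparison, here is a minimal sketch of the "canonicalise,
then compare strings" idea using a parser-based tool instead
of XSLT. It assumes Python 3.8 or later, whose standard
library provides xml.etree.ElementTree.canonicalize (an
implementation of Canonical XML 2.0). The helper name
same_canonical_form and the sample documents are invented
for illustration:

```python
from xml.etree.ElementTree import canonicalize


def same_canonical_form(xml_a, xml_b) -> bool:
    """Compare two XML documents by their C14N 2.0 serialisations."""
    return canonicalize(xml_a) == canonicalize(xml_b)


# Hypothetical inputs: the same content with a different encoding,
# attribute order, and empty-element syntax.
latin1_doc = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    '<a y="2" x="1">caf\u00e9<b/></a>'
).encode("iso-8859-1")
utf8_doc = '<a x="1" y="2">caf\u00e9<b></b></a>'.encode("utf-8")

# Same as utf8_doc except for one attribute value.
other_doc = '<a x="1" y="3">caf\u00e9<b></b></a>'.encode("utf-8")

print(same_canonical_form(latin1_doc, utf8_doc))
print(same_canonical_form(utf8_doc, other_doc))
```

The parser decodes each input according to its declared
encoding before the canonical form is written, so the
ISO-8859-1 and UTF-8 inputs end up in the same form. Note
that whitespace-only text between elements is significant
by default, so differently indented documents will still
compare as different unless that is handled separately.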

Regards,
Mukul

> David