
RE: Testing 2 XML documents for equality - a solution

Subject: RE: Testing 2 XML documents for equality - a solution
From: Mukul Gandhi <mukul_gandhi@xxxxxxxxx>
Date: Sun, 3 Apr 2005 09:43:20 -0700 (PDT)
Hi Mike (and all),
  I have attempted to define the problem (of comparing
2 XML documents). It's in PDF form. Anyone may
access the file at
http://gandhimukul.tripod.com/comparing_xml_documents.pdf
(size approx. 198 KB).

Some spelling and grammar errors are to be expected. They
are unintentional, as is any disrespect such errors
may convey.

I have kept various messages of this thread intact
below, in case anybody wishes to understand the
background of the problem.

I have not yet modified the XSLT I posted earlier.
I'll do so after receiving feedback on this
work.

All suggestions, debates and corrections are welcome.

I am keeping my fingers crossed.

Regards,
Mukul

--- Michael Kay <mike@xxxxxxxxxxxx> wrote:

> You're still struggling a bit.
> 
> Let's start with requirements. What is this for? This is part of the
> difficulty: there are many reasons for wanting to compare two XML documents,
> and the different requirements don't necessarily lead to the same
> specification. If you describe some use cases this will help you on the way.
> For example, it will tell you whether it's enough to give a boolean answer,
> or whether you need to pinpoint where the two trees differ.
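
To make the difference between the two kinds of answer concrete, here is a
minimal sketch in Python (not XSLT, and not the stylesheet discussed in this
thread). It uses the standard xml.etree.ElementTree module; the function
names first_difference and compare are hypothetical, and the rule that only
element names, attributes and text are compared is an assumption made for
illustration.

```python
import xml.etree.ElementTree as ET


def first_difference(a, b, path):
    """Return an XPath-like location of the first mismatch, or None."""
    # Assumption: only names, attributes and text matter. Comments and
    # processing instructions are dropped by the default parser.
    if a.tag != b.tag:
        return path
    if a.attrib != b.attrib:
        return path + "/@*"
    if (a.text or "") != (b.text or ""):
        return path + "/text()"
    if len(a) != len(b):
        return path + "/*"
    for pos, (child_a, child_b) in enumerate(zip(a, b), start=1):
        child_path = f"{path}/*[{pos}]"
        diff = first_difference(child_a, child_b, child_path)
        if diff is not None:
            return diff
        # ElementTree keeps text that follows a child in child.tail.
        if (child_a.tail or "") != (child_b.tail or ""):
            return child_path + "/following-sibling::text()[1]"
    return None


def compare(xml_a, xml_b):
    """Return (equal, location): the boolean answer plus the pinpoint."""
    where = first_difference(ET.fromstring(xml_a), ET.fromstring(xml_b), "/*")
    return where is None, where
```

A caller that only needs yes or no can ignore the second value; a caller that
needs to report where the trees diverge uses it, for example
compare("<a><b/></a>", "<a><c/></a>").
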
> 
> The next step is specification. This doesn't have to be mathematical, but it
> does have to be rigorous. Specifying it in terms of a comparison of two
> drawings of the trees being alike isn't going to be helpful. I know what
> you're getting at: you're trying to say that there's a one-to-one
> correspondence between the nodes and arcs in one tree and the nodes and arcs
> in the other. But you haven't said which properties of the nodes are
> important (namespace prefix? base uri? type annotation?), you haven't said
> how you will compare values (string comparison, with or without Unicode
> normalization? Collations? typed value comparison?), and you haven't said
> how you will handle the significance of ordering.
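
One way to see why these choices must be stated is to write them down as
explicit options. The following is a minimal sketch in Python, not a proposed
specification: the names Policy, normalize_unicode, collapse_whitespace,
ordered_children and equal are all hypothetical. It assumes that namespace
prefixes are not significant (ElementTree compares expanded names) and it
ignores base URIs, type annotations and collations entirely.

```python
import unicodedata
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical options; each answers one open specification question."""
    normalize_unicode: bool = False    # apply NFC before comparing strings
    collapse_whitespace: bool = False  # treat whitespace runs as one space
    ordered_children: bool = True      # is sibling order significant?


def _value(s, policy):
    """Normalize a text or attribute value according to the policy."""
    s = s or ""
    if policy.normalize_unicode:
        s = unicodedata.normalize("NFC", s)
    if policy.collapse_whitespace:
        s = " ".join(s.split())
    return s


def _key(elem, policy):
    """Build a comparable canonical form of an element under the policy."""
    # Attribute order is never significant, so attributes are always sorted.
    attrs = tuple(sorted((k, _value(v, policy)) for k, v in elem.attrib.items()))
    # Each child is paired with the text that follows it (its tail).
    children = [(_key(c, policy), _value(c.tail, policy)) for c in elem]
    if not policy.ordered_children:
        children.sort()
    return (elem.tag, attrs, _value(elem.text, policy), tuple(children))


def equal(xml_a, xml_b, policy=Policy()):
    """Compare two XML strings; the answer depends on the chosen policy."""
    return _key(ET.fromstring(xml_a), policy) == _key(ET.fromstring(xml_b), policy)
```

The same pair of documents can then be equal under one policy and unequal
under another: "<a><b/><c/></a>" and "<a><c/><b/></a>" differ by default but
match with Policy(ordered_children=False). Which answer is right depends on
the use case.
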
> 
> Finally, implementation (which is where you started). Before you embark on
> an implementation you should have an idea of the use cases (see above) and
> their performance requirements. For example, is the algorithm to be
> optimized for comparing trees that are probably the same or very similar, or
> for comparing trees that are likely to be wildly different?
> 
> Sorry if this is a bit severe: but you did ask for help.
> 
> Michael Kay
> http://www.saxonica.com/
> 
> 
> 
> > -----Original Message-----
> > From: Mukul Gandhi [mailto:mukul_gandhi@xxxxxxxxx]
> > Sent: 31 March 2005 22:49
> > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > Subject: Re:  Testing 2 XML documents for equality - a solution
> > 
> > Hi Dimitre,
> >   Below is the "scope" of my solution. My definition
> > of equality of XML documents consists of 2 parts:
> > 
> > Part 1) Node types that the stylesheet compares
> > -------
> > "XPath 1.0" trees define 7 kinds of nodes. These are
> > listed below. I have marked yes or no against each node
> > type, indicating whether my stylesheet has logic to
> > compare nodes of that kind. If the XML documents contain
> > nodes of a kind marked "no", then my stylesheet may
> > give a wrong result (I have not done any testing for
> > nodes marked "no").
> > 
> > root nodes - yes
> > element nodes - yes
> > text nodes - yes
> > attribute nodes - yes
> > namespace nodes - no
> > processing instruction nodes - no
> > comment nodes - no
> > 
> > Part 2) My notion of equality of 2 XML documents
> > -------
> > Imagine that the XPath trees of the 2 documents are *drawn
> > on paper*. The diagram is similar to the XPath
> > tree diagram in Mike's book (XSLT 2nd Edition,
> > Programmer's Reference), page 57 (section "The Tree
> > Model").
> > 
> > If the XPath trees of the 2 XML documents "look the same" on
> > paper (as on page 57 of Mike's book), the documents
> > will be considered equal by my stylesheet.
> > 
> > The scope of my stylesheet presently covers only these
> > 2 points.
> > 
> > I don't claim any other capability for my stylesheet.
> > 
> > I have not attempted to define equality of XML documents
> > in mathematical terms (such as relations, as you
> > mentioned; a subject I don't understand well) or
> > canonical terms (as defined by the Canonical XML spec).
> > 
> > So, considering the above scope of my work, can my
> > stylesheet be evaluated for correctness?
> > 
> > I have deep regard for the people who participated in this
> > thread. They surely have deep knowledge of the
> > subject.
> > 
> > Regards,
> > Mukul
> > 
> > --- Dimitre Novatchev <dnovatchev@xxxxxxxxx> wrote:
> > > Hi Mukul,
> > > 
> > > 
> > > On Thu, 31 Mar 2005 04:36:32 -0800 (PST), Mukul Gandhi
> > > <mukul_gandhi@xxxxxxxxx> wrote:
> > > > Hi Dimitre,
> > > >  I am really not good at mathematics at this level. I
> > > > did study relations like "symmetric, reflexive
> > > > and transitive" some time back, but only to score
> > > > grades. I had no idea then of their practical use.
> > > > It is indeed enlightening for me to know they have real
> > > > practical use (in XML & XSLT!). I cannot define my
> > > > problem in these terms, as my knowledge is limited.
> > > 
> > > This confirms the conclusion that here we see attempts at offering a
> > > solution to a problem that is not well defined.
> > > 
> > > How can we then judge the solution? 
> > > 
> > > > 
> > > > I would be happy if you could define in these precise
> > > > terms the problem I am trying to solve (based on my
> > > > earlier posts to this thread).
> > > 
> > > Impossible.
> > > 
> > > >  I'll keep it as a
> > > > reference for future use. I defined the problem (I am
> > > > trying to solve) from an average programmer's point of
> > > > view. And I think that it is quite understandable to
> > > > an average programmer ;)
> > > 
> > > A number of very wise people already explained why this is difficult
> > > to define -- they also found holes in your definition (and
> > > understanding) of the problem. These people obviously are not average
> > > programmers.
> > > 
> > > Cheers,
> > > Dimitre Novatchev.
> > > 
> > > 
> > 
> > 
> 
> 



		
