Re: Testing 2 XML documents for equality - a solution

Subject: Re: Testing 2 XML documents for equality - a solution
From: Mukul Gandhi <mukul_gandhi@xxxxxxxxx>
Date: Wed, 30 Mar 2005 08:40:58 -0800 (PST)
Hi David,

Thanks a lot for your observations. Please see my
responses below your comments.

> I don't think the stylesheet really works.
> For example for attribute nodes you just concatenate
> the names and
> values so even if you could be sure that the order
> of attribute nodes
> was preserved (you can't be sure of this) then
> x="2" and x2="" would be considered equal.

Thanks a lot for pointing out this bug! To correct it, I
propose this alternative code (to be applied to both
documents):
<xsl:for-each select="$doc1//@*">
  <xsl:value-of select="name()"/>
  <xsl:text>&#xa;</xsl:text>
  <xsl:value-of select="."/>
</xsl:for-each>

(i.e. introducing an extra character between the attribute
name and value, one that is unlikely to occur in an
attribute value, e.g. a newline character)
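
To make the collision and the proposed separator concrete, here is a minimal sketch in Python rather than XSLT. The function `fingerprint` and its parameters `name_sep` and `pair_sep` are hypothetical names used only for illustration; they are not part of the stylesheet. The sketch assumes that neither names nor values contain a newline.

```python
def fingerprint(attrs, name_sep="", pair_sep=""):
    """Concatenate (name, value) pairs into one comparison string."""
    return "".join(name + name_sep + value + pair_sep for name, value in attrs)


# David's case: x="2" versus x2=""
first = [("x", "2")]
second = [("x2", "")]

# Plain concatenation cannot tell the two apart: both give "x2".
collides = fingerprint(first) == fingerprint(second)

# A newline between name and value separates them: "x\n2" versus "x2\n".
fixed = fingerprint(first, name_sep="\n") != fingerprint(second, name_sep="\n")

# With several attributes, a value can still run into the next name:
# a="x" y="2" and a="" xy="2" both give "a\nxy\n2".
third = [("a", "x"), ("y", "2")]
fourth = [("a", ""), ("xy", "2")]
still_collides = (
    fingerprint(third, name_sep="\n") == fingerprint(fourth, name_sep="\n")
)

# A second separator after each value removes that case as well.
resolved = fingerprint(third, "\n", "\n") != fingerprint(fourth, "\n", "\n")

print(collides, fixed, still_collides, resolved)
```

The last two checks suggest that a separator after each value, in addition to the one between name and value, may be worth adding to the for-each.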

> Also your ignore white space test ignores far too
> much:
> 
> <xsl:for-each
>   select="$doc1//node()[not(normalize-space(self::text()) = '')]">
>   <xsl:value-of select="name()"/><xsl:value-of select="."/>
> 
> consider the 2 document fragments
> 
> <x>
>  <a/>
> </x>
> 
> 
> <y>
>  <b/>
> </y>
> 
> in the first document the nodes x and a and both the
> text nodes all
> satisfy
> normalize-space(self::text())= ''
> so the for-each will be empty.
> Similarly in the second fragment.
> 
> so presumably these documents will compare equal,
> which seems strange.

These documents are reported as not equal, so I think I
am right here. For this example, the $doc1//node() path
expression returns 4 nodes (2 element nodes and 2
whitespace-only text nodes). The whitespace-only text
nodes are filtered out by the predicate
[not(normalize-space(self::text()) = '')].

> Conversely you cannot be sure that
> <x a="2" b="3"/> will compare equal to
> <x a="2" b="3"/>
> as the attributes may be reported in one order for
> doc1 and the other
> order for doc2.

I agree that an XML parser is not required to report
attribute nodes in the same order. But I think we can
reasonably assume that a specific XML parser will
report attributes in a consistent order: it must use a
specific algorithm for this, whose outcome is
predictable. I know I cannot prove this in theory, but
can you provide any practical evidence of an XML parser
reporting attributes in a different order? Since both
documents are processed by the same parser, the outcome
will always be predictable! I have tested the same
example with a single product multiple times, and I
always get the same result.
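
One way to avoid relying on parser order at all is to sort the attributes before comparing them. The sketch below is in Python with the standard xml.etree.ElementTree module, not in XSLT, and it is a different technique from the stylesheet discussed above. The helper name `attribute_key` is hypothetical.

```python
import xml.etree.ElementTree as ET


def attribute_key(element):
    """Return the element's attributes as a sorted list of (name, value) pairs."""
    return sorted(element.attrib.items())


# The same attributes, written in a different order in each document.
doc1 = ET.fromstring('<x a="2" b="3"/>')
doc2 = ET.fromstring('<x b="3" a="2"/>')

# Sorting by name gives both documents the same key, whatever order
# the parser reported the attributes in.
same = attribute_key(doc1) == attribute_key(doc2)
print(same)
```

In XSLT the same idea would presumably be an xsl:sort on name() inside the for-each, though I have not tested that here.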

Regards,
Mukul

> David
