RE: normalize as part of a 'select-distinct' in a for-each?

From: Mike Brown <mbrown@xxxxxxxxxxxxx>
Date: Sun, 3 Oct 1999 18:02:06 -0600
> I modified David Carlisle's example (FAQ 2.4) to use 
> normalize() since whitespace distinctions are not desired.
> However, when I add normalize(), the stylesheet stops 
> returning the expected "XYZ" and instead gives "XXXXYZZ" 
> What am I doing wrong here?

I can explain why you're getting these results, but I don't have a way to
solve your problem. What you are doing wrong is passing normalize() a
node-set of up to 9 text nodes at a time; it only ever looks at the first
one.

http://www.w3.org/TR/xpath#axes: "the following axis contains all nodes in
the same document as the context node that are after the context node in
document order, excluding any descendants and excluding attribute nodes and
namespace nodes."

First, why do you get 'XYZ' without the attempt at normalization?

	//c[not(text()=following::c/text())]

//c will test all the "c" element nodes in document order. Only those for
which [...] is true will be selected. The sort order you specified will be
applied to these selected nodes for purposes of iterating through your
xsl:for-each.

For each node being tested, text() is a node-set with just one member: the
'X', 'Y', or 'Z' text node child, as expected. following::c/text() is a
node-set with every text node child of every "c" element node that comes
after the node being tested in document order (not counting descendants of
the node being tested).

http://www.w3.org/TR/xpath#booleans: "If both objects to be compared are
node-sets, then the comparison will be true if and only if there is a node
in the first node-set and a node in the second node-set such that the result
of performing the comparison on the string-values of the two nodes is true"

http://www.w3.org/TR/xpath#section-Text-Nodes: "The string-value of a text
node is the character data"
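
The comparison rule quoted above can be modelled outside XPath. This is only a
rough Python sketch, not anything from the spec: node-sets are represented as
plain lists of string-values, and the name nodeset_equals is made up for
illustration.

```python
def nodeset_equals(left, right):
    """Model XPath '=' on two node-sets: true if ANY pair of string-values matches."""
    return any(a == b for a in left for b in right)

# Each list stands in for a node-set; each string for a text node's string-value.
print(nodeset_equals(["X"], ["Y", "X", "Z"]))  # True: 'X' occurs on the right
print(nodeset_equals(["Y"], ["X", "Z"]))       # False: no pair matches
print(nodeset_equals(["X"], []))               # False: nothing to pair with
```

Note that the test is "is there at least one matching pair", not "are the two
sets identical", which is why it works for picking out duplicates.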

So then, going through the //c elements, is the "text()" node-set equal
to the "following::c/text()" node-set? The answer, in the fourth column, is
true (i.e., yes, they are equal) if the item in the second column **can be
found in** the third.

//c:     	text():	following::c/text():               	result:
<c>X</c>	'X'    	'Y','X','Z','Z','Z','X','Z','X','X'	true
<c>Y</c>	'Y'    	'X','Z','Z','Z','X','Z','X','X'    	false
<c>X</c>	'X'    	'Z','Z','Z','X','Z','X','X'        	true
<c>Z</c>	'Z'    	'Z','Z','X','Z','X','X'            	true
<c>Z</c>	'Z'    	'Z','X','Z','X','X'                	true
<c>Z</c>	'Z'    	'X','Z','X','X'                    	true
<c>X</c>	'X'    	'Z','X','X'                       	true
<c>Z</c>	'Z'    	'X','X'                            	false
<c>X</c>	'X'    	'X'                                	true
<c>X</c>	'X'    	(empty)                             false

Therefore, //c[not(text()=following::c/text())] will select the //c elements
for which the comparison is false, which are these:
	<c>Y</c>
	<c>Z</c>
	<c>X</c>
...which you then sorted in ascending order, outputting their string values
to produce 'XYZ'.
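
To check the table by hand, the whole selection can be replayed in a few lines
of Python. This is a sketch under two assumptions: the ten "c" elements are
not nested inside one another (so following::c is simply "every later c"), and
each has exactly one text node. The function name is invented for the example.

```python
# Text of the ten <c> elements in document order, as listed in the table.
values = ["X", "Y", "X", "Z", "Z", "Z", "X", "Z", "X", "X"]

def select_last_occurrences(values):
    """Model //c[not(text()=following::c/text())] over a flat list of texts."""
    selected = []
    for i, text in enumerate(values):
        following = values[i + 1:]                    # following::c/text()
        matches = any(text == f for f in following)   # node-set '=' is existential
        if not matches:                               # the not(...) in the predicate
            selected.append(text)
    return selected

picked = select_last_occurrences(values)
print(picked)                   # ['Y', 'Z', 'X'], in document order
print("".join(sorted(picked)))  # XYZ
```

In other words the predicate keeps the last occurrence of each distinct value,
and the sort then puts those into alphabetical order.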

Second, why did you get 'XXXXYZZ' when you applied normalize() to the
node-sets in the second and third columns?

http://www.w3.org/TR/xpath#section-String-Functions: "The normalize function
returns the argument string with white space normalized ..." [and] "A
node-set is converted to a string by returning the string-value of the node
in the node-set that is first in document order. If the node-set is empty,
an empty string is returned."
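
The two rules quoted above (node-set to string, then whitespace normalization)
can be sketched in Python as follows. The helper names are invented, node-sets
are again plain lists of strings, and the sample data is hypothetical.
normalize() is the name used in the draft quoted here; the final XPath 1.0
Recommendation calls the same function normalize-space().

```python
import re

def string_of(nodeset):
    """Model string(node-set): string-value of the first node, or '' if empty."""
    return nodeset[0] if nodeset else ""

def normalize(s):
    """Collapse runs of XML whitespace (space, tab, CR, LF) and trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")

# Hypothetical following::c/text() values, in document order.
following = [" Y ", "X", "Z"]
print(repr(normalize(string_of(following))))  # 'Y': only the first node is used
print(repr(normalize(string_of([]))))         # '': an empty node-set gives ''
```

The important part is string_of(): every node after the first is silently
dropped before normalize() ever sees the data.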

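So each side is reduced to the string-value of its first node before the
comparison is made, and the third column effectively shrinks to one item:

//c:     	text():	following::c/text():	result: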
<c>X</c>	'X'    	'Y' (and others)    	false
<c>Y</c>	'Y'    	'X' (and others)    	false
<c>X</c>	'X'    	'Z' (and others)    	false
<c>Z</c>	'Z'    	'Z' (and others)    	true
<c>Z</c>	'Z'    	'Z' (and others)    	true
<c>Z</c>	'Z'    	'X' (and others)    	false
<c>X</c>	'X'    	'Z' (and others)    	false
<c>Z</c>	'Z'    	'X' (and others)    	false
<c>X</c>	'X'    	'X'                 	true
<c>X</c>	'X'    	(empty)             	false

Thus, //c[not(normalize(text())=normalize(following::c/text()))] selects:
	<c>X</c>
	<c>Y</c>
	<c>X</c>
	<c>Z</c>
	<c>X</c>
	<c>Z</c>
	<c>X</c>
...which, when sorted and output as string values, produces 'XXXXYZZ'.
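
The same result falls out of a small Python replay of the normalized
comparison. As before this is only a sketch: it assumes non-nested "c"
elements with one text node each, and the function names are made up.

```python
import re

# Text of the ten <c> elements in document order, as listed in the table.
values = ["X", "Y", "X", "Z", "Z", "Z", "X", "Z", "X", "X"]

def first_string(nodeset):
    """string(node-set): only the first node in document order counts."""
    return nodeset[0] if nodeset else ""

def normalize(s):
    """Collapse runs of XML whitespace and trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")

def select_with_normalize(values):
    """Model //c[not(normalize(text())=normalize(following::c/text()))]."""
    selected = []
    for i, text in enumerate(values):
        left = normalize(first_string([text]))
        right = normalize(first_string(values[i + 1:]))  # next <c> only
        if left != right:
            selected.append(text)
    return selected

picked = select_with_normalize(values)
print(picked)                   # ['X', 'Y', 'X', 'Z', 'X', 'Z', 'X']
print("".join(sorted(picked)))  # XXXXYZZ
```

So with normalize() in place the predicate only asks "is the very next c
different from me?", which is a much weaker test than the original one.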


The solution is a little beyond me, though. I'd assume that you'd have to do
it with recursive template calls that mimic the XPath evaluation above, but
with normalize() applied to each text node individually. It wouldn't be
efficient at all. Why don't you just normalize your source data first? :)
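
For what it's worth, here is roughly what "normalize each node individually,
then compare" would compute, sketched in Python rather than XSLT. It is not a
stylesheet solution, just an illustration of the logic; the function names and
the whitespace-laden sample data are hypothetical.

```python
import re

def normalize(s):
    """Collapse runs of XML whitespace and trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")

def distinct_normalized(values):
    """Keep a value only if no LATER value has the same normalized text."""
    selected = []
    for i, text in enumerate(values):
        key = normalize(text)
        later = {normalize(f) for f in values[i + 1:]}  # normalized one by one
        if key not in later:
            selected.append(key)
    return selected

# Hypothetical source data: the same three values with stray whitespace.
raw = ["X", " Y", "X ", "Z", " Z ", "Z\n", "X", "Z", "\tX", "X"]
print("".join(sorted(distinct_normalized(raw))))  # XYZ
```

Every element is compared against everything after it, so the work grows with
the square of the number of elements, which is the inefficiency I was worried
about.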


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

