
Re: Question on duplicate node elimination

Subject: Re: Question on duplicate node elimination
From: Lars Huttar <lars_huttar@xxxxxxx>
Date: Mon, 23 Aug 2010 15:54:05 -0500
 On 8/22/2010 5:12 PM, Hermann Stamm-Wilbrandt wrote:
>> I'm not sure what you find surprising about the results you are seeing.
>> What results would you expect?
> Not surprising.
>
> But how could the algorithm step of "duplicate elimination" be done?
> How can the duplicates be determined and removed, correctly?
>

If I'm understanding your question correctly (are you trying to
implement an XPath processor in XSLT 1.0?), I think it's impossible if
you create the RTF (result tree fragment) simply using xsl:copy-of. As
Mike said, once you've copied nodes, the copies are distinct; there's no
information in the RTF(s) to distinguish copies of the same node from
copies of identical twins.

Could you create the RTF using a "special" attribute that preserves the
ID of the node which you are copying? E.g.

    <xsl:attribute name="originalID"
                   namespace="http://hsw.org/specialNamespaceURI">
      <xsl:value-of select="generate-id()"/>
    </xsl:attribute>

Then you could use that originalID attribute to determine which nodes
were identical in the original, and strip out the originalID attribute
after using it.

But I guess this would only work on elements, since only elements can
have attributes...
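
To make the idea concrete, here is a rough sketch against the simple.xml
from the original post. It is only one way of doing it, under a few
assumptions: the hsw prefix is my own binding for the namespace URI
above, the processor supports exsl:node-set(), and only element nodes
are being copied. The duplicate test is a plain preceding-sibling
comparison, which is quadratic but fine for small node-sets.

```xslt
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:exsl="http://exslt.org/common"
  xmlns:hsw="http://hsw.org/specialNamespaceURI">

  <xsl:output omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <!-- Copy the parent of each /a/b/c, tagging every copy with the
         generate-id() of the original node it was copied from. -->
    <xsl:variable name="rtf">
      <xsl:for-each select="/a/b/c">
        <xsl:for-each select="..">
          <xsl:copy>
            <xsl:attribute name="originalID"
                           namespace="http://hsw.org/specialNamespaceURI">
              <xsl:value-of select="generate-id()"/>
            </xsl:attribute>
            <xsl:copy-of select="@* | node()"/>
          </xsl:copy>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:variable>

    <!-- Keep a copy only if no earlier copy carries the same originalID. -->
    <xsl:for-each select="exsl:node-set($rtf)/*
        [not(@hsw:originalID = preceding-sibling::*/@hsw:originalID)]">
      <xsl:copy>
        <!-- Copy everything except the helper attribute. -->
        <xsl:copy-of select="@*[not(local-name() = 'originalID' and
            namespace-uri() = 'http://hsw.org/specialNamespaceURI')]
            | node()"/>
      </xsl:copy>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For simple.xml this should leave two <b> elements instead of four, since
the first two copies share one originalID and the last two share another.
Depending on the processor, a namespace declaration for the helper
namespace may still show up in the serialized output even though the
attribute itself is gone.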

Lars



> Perhaps I was not clear enough with my question.
> How can this step (p. 40 from [1]) be implemented in XPath 1.0 plus
> exslt:node-set():
> A location step identifies a new node-set relative to the context node-set.
> The location step is evaluated against each node in the context node-set,
> and the union of the resulting node-sets becomes the context node-set for
> the next step. Location steps consist of an axis identifier, a node test
> and zero or more predicates (see Figure 3-4). ...
>
>
> [1]
> http://www.theserverside.net/tt/books/addisonwesley/EssentialXML/index.tss
>
> Mit besten Gruessen / Best wishes,
>
> Hermann Stamm-Wilbrandt
> Developer, XML Compiler, L3
> WebSphere DataPower SOA Appliances
> ----------------------------------------------------------------------
> IBM Deutschland Research & Development GmbH
> Vorsitzender des Aufsichtsrats: Martin Jetter
> Geschaeftsfuehrung: Dirk Wittkopp
> Sitz der Gesellschaft: Boeblingen
> Registergericht: Amtsgericht Stuttgart, HRB 243294
>
>
>
> From:       Michael Kay <mike@xxxxxxxxxxxx>
> To:         xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Date:       08/22/2010 11:53 PM
> Subject:    Re:  Question on duplicate node elimination
>
>
>
> I'm not sure what you find surprising about the results you are seeing.
> What results would you expect?
>
> xsl:copy-of creates a new node. Copying the same node twice creates two
> copies with distinct identity. Is that the issue?
>
> Michael Kay
> Saxonica
>
> On 22/08/2010 22:25, Hermann Stamm-Wilbrandt wrote:
>> Hello,
>>
>> I have a question on duplicate node elimination.
>>
>> From the XPath 1.0 specification:
>> ...
>> * node-set (an unordered collection of nodes without duplicates)
>> ...
>> An initial sequence of steps is composed together with a following step
>> as follows. The initial sequence of steps selects a set of nodes relative
>> to a context node. Each node in that set is used as a context node for the
>> following step. The sets of nodes identified by that step are unioned
>> together. The set of nodes identified by the composition of the steps is
>> this union.
>> ...
>>
>> So "are unioned together" results in a node-set and that does not contain
>> duplicates.
>>
>> Now how can this algorithm step be realized in XPath 1.0 plus the
>> exslt:node-set function?
>> (this would work in browsers with the technique from David Carlisle [1])
>>
>>
>> This is the output of the stylesheet simple.xsl below on the file
>> simple.xml. For the four nodes /a/b/c, their parents are copied into an
>> intermediate result. But xsltproc and xalan show, by their generate-id()
>> values, that the four copies are different, whereas the first pair and
>> the last pair are representations of the same node.
>>
>> xsltproc        xalan
>> 1: id2659470    1: AbT0
>> 2: id2659470    2: AbT0
>> 3: id2659354    3: AbT1
>> 4: id2659354    4: AbT1
>>
>> 1: id2659234    1: AbT2
>> 2: id2659244    2: AbT3
>> 3: id2659254    3: AbT4
>> 4: id2659264    4: AbT5
>>
>> 1:<b>           1:<b>
>>      <c>1</c>         <c>1</c>
>>      <c>2</c>         <c>2</c>
>>    </b>             </b>
>> 2:<b>           2:<b>
>>      <c>1</c>         <c>1</c>
>>      <c>2</c>         <c>2</c>
>>    </b>             </b>
>> 3:<b>           3:<b>
>>      <c>1</c>         <c>1</c>
>>      <c>2</c>         <c>2</c>
>>    </b>             </b>
>> 4:<b>           4:<b>
>>      <c>1</c>         <c>1</c>
>>      <c>2</c>         <c>2</c>
>>    </b>             </b>
>>
>>
>>
>> $ cat simple.xml
>> <a>
>>    <b>
>>      <c>1</c>
>>      <c>2</c>
>>    </b>
>>    <b>
>>      <c>1</c>
>>      <c>2</c>
>>    </b>
>> </a>
>> $ cat simple.xsl
>> <xsl:stylesheet version="1.0"
>>    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>>    xmlns:exsl="http://exslt.org/common">
>>
>>    <xsl:output omit-xml-declaration="yes"/>
>>
>>    <xsl:template match="/">
>>      <xsl:variable name="rtf">
>>        <xsl:for-each select="/a/b/c">
>>          <xsl:copy-of select=".."/>
>>        </xsl:for-each>
>>      </xsl:variable>
>>
>>      <xsl:for-each select="/a/b/c">
>>        <xsl:value-of select="position()"/><xsl:text>:</xsl:text>
>>        <xsl:value-of select="generate-id(..)"/><xsl:text>&#10;</xsl:text>
>>      </xsl:for-each>
>>
>>      <xsl:text>&#10;</xsl:text>
>>
>>      <xsl:for-each select="exsl:node-set($rtf)/*">
>>        <xsl:value-of select="position()"/><xsl:text>:</xsl:text>
>>        <xsl:value-of select="generate-id(.)"/><xsl:text>&#10;</xsl:text>
>>      </xsl:for-each>
>>
>>      <xsl:text>&#10;</xsl:text>
>>
>>      <xsl:for-each select="exsl:node-set($rtf)/*">
>>        <xsl:value-of select="position()"/><xsl:text>:</xsl:text>
>>        <xsl:copy-of select="."/><xsl:text>&#10;</xsl:text>
>>      </xsl:for-each>
>>    </xsl:template>
>>
>> </xsl:stylesheet>
>> $
>>
>>
>> [1] http://dpcarlisle.blogspot.com/2007/05/exslt-node-set-function.html
>>
>>
>> Mit besten Gruessen / Best wishes,
>>
>> Hermann Stamm-Wilbrandt
>> Developer, XML Compiler, L3
>> WebSphere DataPower SOA Appliances
>> ----------------------------------------------------------------------
>> IBM Deutschland Research & Development GmbH
>> Vorsitzender des Aufsichtsrats: Martin Jetter
>> Geschaeftsfuehrung: Dirk Wittkopp
>> Sitz der Gesellschaft: Boeblingen
>> Registergericht: Amtsgericht Stuttgart, HRB 243294
>
