Re: Expensive XSLT2 - suggestions for improving?

Subject: Re: Expensive XSLT2 - suggestions for improving?
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Thu, 16 Oct 2008 12:34:38 -0400
Michael,

This is an interesting problem, and you may want to try a few things.

Part of what makes it interesting is the question of how widely you wish to scope your search for similar values. In XSLT 2.0, the key() function accepts a third argument, a node, which restricts retrieval to the subtree rooted at that node.

You could try something like this:

<xsl:key name="oid-by-value" match="@oid" use="string(..)"/>
<!-- retrieves an @oid attribute using the string value of its parent element -->


and then

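<xsl:template match="value">
  <!-- the @oid of the first element with the same string value -->
  <xsl:variable name="first" select="key('oid-by-value', string(.))[1]"/>
  <xsl:copy>
    <xsl:apply-templates select="@*"/>
    <xsl:choose>
      <xsl:when test="$first except @oid">
        <!-- a duplicate: point at the first occurrence and skip content.
             Note the comparison is against @oid, not '.', since the key
             returns attribute nodes -->
        <xsl:attribute name="refoid" select="$first"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- the first occurrence itself: keep its content -->
        <xsl:apply-templates/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:copy>
</xsl:template>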

If you wanted to scope only within the parent element, you could use key('oid-by-value', ., ..)[1] -- the '..' as the third argument restricts retrieval to the subtree rooted at the parent.
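
To make the scoped variant concrete, here is a sketch of a complete stylesheet, assuming the input shape shown in the question (value elements carrying an @oid). It also assumes that the first occurrence should keep its content and that only later duplicates are emptied. Like the snippets above, it is a sketch and has not been run against real data.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- index each @oid by the string value of its parent element -->
  <xsl:key name="oid-by-value" match="@oid" use="string(..)"/>

  <!-- pass through all other nodes and attributes unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="value[@oid]">
    <!-- first @oid with the same text, searching only inside the parent
         element (third argument '..') -->
    <xsl:variable name="first"
                  select="key('oid-by-value', string(.), ..)[1]"/>
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:choose>
        <xsl:when test="$first is @oid">
          <!-- this is the first occurrence: keep its content -->
          <xsl:apply-templates select="node()"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- a later duplicate: reference the first, drop the content -->
          <xsl:attribute name="refoid" select="$first"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Dropping the third argument of key() widens the search to the whole document, which is what the original preceding::value test does. Either way the key is built once per document, so each lookup no longer rescans all preceding elements.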

Note: untested. (But if it won't work, surely some sharp-eyed XSLTer will notice.)

Cheers,
Wendell


At 11:54 AM 10/16/2008, you wrote:
Hello experts,

The task is to remove duplicate text content before moving an XML file
into translation. After the translation, the former duplicate content
should be recreated.

Assume this input XML (I dropped a lot of attributes):

<Doc>
<value oid="40068">Lasttrennschalter</value>
<value oid="40069">Umbau von N12 auf N4</value>
<value oid="4006a">Lasttrennschalter</value>
</Doc>

The third <value> should be empty because its content is identical to
the first, but we need a pointer to that first element to be able to
recreate the content after translation. Also, all original attributes
must stay unchanged. Therefore in each duplicate I insert an extra
attribute @refoid with the @oid of the source element. So I get this:

<Doc>
<value oid="40068">Lasttrennschalter</value>
<value oid="40069">Umbau von N12 auf N4</value>
<value oid="4006a" refoid="40068"/>
</Doc>
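
The reverse step after translation could be sketched along the same lines. This assumes the translated file keeps @oid and @refoid unchanged and that @oid values are unique; the key name value-by-oid is invented for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- look up a value element by its @oid (hypothetical key name) -->
  <xsl:key name="value-by-oid" match="value" use="@oid"/>

  <!-- pass through all other nodes and attributes unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- a condensed duplicate: drop @refoid and restore the content
       from the element it points to -->
  <xsl:template match="value[@refoid]">
    <xsl:copy>
      <xsl:apply-templates select="@* except @refoid"/>
      <xsl:apply-templates select="key('value-by-oid', @refoid)[1]/node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```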

My XSL is very simple and works as intended, but it does not scale
very well, I guess because I look at preceding::value so many times:

<!-- Condenser: modify all duplicates -->
<xsl:template match="value[.=preceding::value]">
  <xsl:copy>
    <xsl:apply-templates select="@*"/>
    <xsl:attribute name="refoid"
      select="preceding::value[.=current()][last()]/@oid"/>
    <!-- skip content -->
  </xsl:copy>
</xsl:template>

<!-- pass-through all nodes and attributes -->
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

I guess a cleverly constructed key could help a lot... any pointers are
very welcome!


======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
