
Re: Processing two documents, which order?

Subject: Re: Processing two documents, which order?
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Thu, 07 Apr 2011 15:25:55 +0100
On 07/04/2011 14:25, Dave Pawson wrote:

I have two xml documents. The first is a list of marked up words (1), the second a 'normal' xml document (2)

For each occurrence in 2 of a word from 1
I need to mark up the word with <property> </property>

Which order is anywhere near optimum?
Document 1 has about 300 words,
Document 2 is 33,000 lines.

I'm having trouble seeing how this description of the problem relates to the code given below.

From first principles, if you do a nested loop then you're doing either 300*33000 operations or 33000*300 - it's not a big difference either way. On the other hand, if you use keys, then you are basically doing 300+33000 operations either way - but the key will be smaller if you build it on the smaller document, so that's what I would do.
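To make the key-based approach concrete, here is a minimal sketch, not the poster's actual stylesheet. It assumes the word list lives in a file called words.xml shaped like <words><word>...</word></words>, and that every entry is a single token with no spaces; the file name and element names are hypothetical. The key is built over the small document, and the large document is walked once, with one key probe per token.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: words.xml is <words><word>foo</word>...</words> -->
  <xsl:variable name="words" select="doc('words.xml')"/>

  <!-- Key over the small document: one entry per listed word -->
  <xsl:key name="word" match="word" use="string(.)"/>

  <!-- Identity copy for everything that is not a text node -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Single pass over the large document: split each text node into
       word and non-word tokens, and probe the key once per word token -->
  <xsl:template match="text()">
    <xsl:analyze-string select="." regex="\w+">
      <xsl:matching-substring>
        <xsl:choose>
          <!-- Three-argument key() searches the word list document -->
          <xsl:when test="exists(key('word', ., $words))">
            <property><xsl:value-of select="."/></property>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="."/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

The regex here is a static literal, so nothing is computed per word. The trade-off is that multi-word entries or entries that are themselves patterns would not be found this way.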

Using regex matching with a dynamically computed regex looks like bad news - or is it really a regex in the source document? Saxon precompiles the regex if it's known statically, but if not there's no caching or anything - it gets compiled on each use. From this viewpoint, using each regex once (in a single analyze-string call) is going to be better.
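One way to use each regex only once is to fold the whole word list into a single alternation and make one analyze-string call per text node. The following is a sketch under assumptions: the word list is in a hypothetical words.xml shaped like <words><word>...</word></words>, and each entry is either a plain word or already a valid regex fragment. Unlike the quoted template, it captures the trailing delimiter in a second group so that it can be written back to the output.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Assumption: words.xml is <words><word>...</word>...</words>, and
       every entry can be dropped into a regex unescaped -->
  <xsl:variable name="alternatives" as="xs:string"
      select="string-join(doc('words.xml')/words/word, '|')"/>

  <!-- Identity copy for everything that is not a text node -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- One analyze-string call per text node covers all words at once.
       Group 1 is the word, group 2 the whitespace or punctuation after it -->
  <xsl:template match="text()">
    <xsl:analyze-string select="."
        regex="({$alternatives})([\s\p{{P}}])">
      <xsl:matching-substring>
        <property><xsl:value-of select="regex-group(1)"/></property>
        <xsl:value-of select="regex-group(2)"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

As in the quoted template, a word is only recognised when a whitespace or punctuation character follows it, and nothing anchors the start of the match, so a listed word can also match at the end of a longer word.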

Michael Kay

This is the template to do the work:

<xsl:template match="*">
     <xsl:param name="property" as="xs:string"/>
     <xsl:analyze-string select="." regex="({$property})[\s\p{{P}}]">
       <xsl:matching-substring>
         <!-- <xsl:message>match on [<xsl:value-of select='regex-group(1)'/>]</xsl:message> -->
         <property><xsl:value-of select="regex-group(1)"/></property>
       </xsl:matching-substring>
       <xsl:non-matching-substring>
         <xsl:copy-of select="."/>
       </xsl:non-matching-substring>
     </xsl:analyze-string>
</xsl:template>

but I'm hesitating as to which loop sequence will work best?
