
Re: xsl:analyze-string problem

Subject: Re: xsl:analyze-string problem
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Thu, 08 Feb 2007 18:16:21 +0100
Yves Forkl wrote:
Hi XSLT 2.0 wizards,

while the syntax and semantics of xsl:analyze-string have become clear to me, I am now in search of an idiom using it that could help me solve this problem. (Or maybe of an alternative...)

In the input I find elements like these:

1) <e> def ghi</e>
2) <e> abc 22 def 3 ghi 1. </e>
3) <e> 2. </e>
4) <e> 3. def 35 78 ghi </e>

The possible contents fit into exactly 4 classes:

1) just some words and/or numbers
2) like 1), but followed by a number and a period
3) just a number and a period
4) like 3), but followed by some words and/or numbers

I understand that a number followed by a period can appear only once, either at the beginning or at the end. Other numbers never have a period immediately following them, and there is nothing between that number and the surrounding spaces.
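
Under that reading, the four classes can be told apart with two independent tests: one for a leading ordinal and one for a trailing ordinal. The following is only a minimal sketch of that idea. The "classify" mode, the "classes" and "class" element names and the inline $data variable are made up for illustration and do not come from the original question.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical sample data, copied from the four example elements -->
  <xsl:variable name="data">
    <e> def ghi</e>
    <e> abc 22 def 3 ghi 1. </e>
    <e> 2. </e>
    <e> 3. def 35 78 ghi </e>
  </xsl:variable>

  <xsl:template match="/">
    <classes>
      <xsl:apply-templates select="$data/e" mode="classify"/>
    </classes>
  </xsl:template>

  <xsl:template match="e" mode="classify">
    <!-- true if the content begins with optional spaces, digits and a period -->
    <xsl:variable name="starts" select="matches(., '^\s*\d+\.')"/>
    <!-- true if the content ends with digits, a period and optional spaces -->
    <xsl:variable name="ends" select="matches(., '\d+\.\s*$')"/>
    <class>
      <xsl:value-of select="if ($starts and $ends) then 3
                            else if ($ends) then 2
                            else if ($starts) then 4
                            else 1"/>
    </class>
  </xsl:template>

</xsl:stylesheet>
```

Run against any input document, this should report the classes 1, 2, 3 and 4 in that order for the sample. It relies on the assumption that an ordinal appears at most once per element; content with an ordinal at both ends would be reported as class 3.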


In each case, spaces may or may not appear at beginning and end of the content and must be preserved (no matter to which group they get attached).

The problem consists of replacing the original "e" element by creating new elements according to these rules:

A) A number followed by a period goes into an "ordinal" element.

This will become the xsl:matching-substring part.


B) Words and numbers go into a "text" element.

This is in the xsl:non-matching-substring.


C) In cases 1) and 4), where words and numbers appear at the end, the content of the current "e" element must be concatenated with all adjacent "e" elements of type 1) and 2) before putting it all into the "text" element.

I assume that by concatenation you mean text concatenation, and not something else, such as combining sibling nodes?


By contrast, in cases 2) and 3), which end with a number and a period, the contents of the following "e" instance should never be appended.

If I understand correctly, this should result in <text>...</text> blocks that each contain one <ordinal> element at the beginning or the end.



What is not clear to me is:


- whether the regex actually suffices to match the rules

I'm not sure either, but I would have chosen a simpler rule.



- if it is a good idea to use xsl:for-each there

I think not, though I do find it original.




- how to ensure concatenation of all the "e" instances' contents in cases 1) and 4) without processing them repeatedly - i.e.: how can I restrict the call to xsl:apply-templates to cases 2) and 3)?

I think you should make it much easier on yourself. Here's an approach you can try:


<!-- $data is assumed to be a global variable or parameter holding your "e" elements -->
<xsl:template match="/">
  <xsl:variable name="parsed">
    <xsl:apply-templates select="$data/e" />
  </xsl:variable>
  <xsl:copy-of select="$parsed" />
</xsl:template>

<xsl:template match="e">
  <!-- match beginning/ending with ordinal in $1 or $2 -->
  <xsl:analyze-string select="." regex="^(\s*\d+\.)|(\d+\.\s*)$">
    <xsl:matching-substring>
      <!-- 'yes' is emitted only when the corresponding group is non-empty -->
      <ordinal
          start="{('yes')[regex-group(1)]}"
          end="{('yes')[regex-group(2)]}">
        <xsl:value-of select="." />
      </ordinal>
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <text><xsl:value-of select="." /></text>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:template>



With $data set to your data, this will output the following:


<text> def ghi</text>
<text> abc 22 def 3 ghi </text>
<ordinal start="" end="yes">1. </ordinal>
<ordinal start="yes" end=""> 2.</ordinal>
<text> </text>
<ordinal start="yes" end=""> 3.</ordinal>
<text> def 35 78 ghi </text>


This is a temporary result tree. You can process it further at the point where I placed the xsl:copy-of instruction. If the 'start' attribute is 'yes', the ordinal was at the beginning of an 'e' element. If the 'end' attribute is 'yes', the ordinal was at the end of the string in an 'e' element. As you can see, the spaces are preserved.


All you need to do is gather the preceding siblings you need, according to your concatenation rules.
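
As one possible starting point, here is a sketch of such a second pass over $parsed. It uses xsl:for-each-group with group-adjacent rather than the preceding-sibling axis, and it implements a deliberately simplified rule: every run of adjacent "text" elements is merged into one, and ordinals are copied unchanged. That is not your full rule set (for instance, the whitespace-only "text" left over from a case 3 element would be merged with a following case 1 element), so treat it as a skeleton to adapt. The "result" wrapper element and the inline $data variable are made up for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical sample data, copied from the four example elements -->
  <xsl:variable name="data">
    <e> def ghi</e>
    <e> abc 22 def 3 ghi 1. </e>
    <e> 2. </e>
    <e> 3. def 35 78 ghi </e>
  </xsl:variable>

  <xsl:template match="/">
    <!-- first pass: split every "e" into "text" and "ordinal" elements -->
    <xsl:variable name="parsed">
      <xsl:apply-templates select="$data/e"/>
    </xsl:variable>
    <!-- second pass: merge each run of adjacent "text" elements -->
    <result>
      <xsl:for-each-group select="$parsed/*"
                          group-adjacent="boolean(self::text)">
        <xsl:choose>
          <!-- key is true for a run of "text" elements -->
          <xsl:when test="current-grouping-key()">
            <text><xsl:value-of select="current-group()" separator=""/></text>
          </xsl:when>
          <!-- anything else (the ordinals) is copied as it is -->
          <xsl:otherwise>
            <xsl:copy-of select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </result>
  </xsl:template>

  <xsl:template match="e">
    <xsl:analyze-string select="." regex="^(\s*\d+\.)|(\d+\.\s*)$">
      <xsl:matching-substring>
        <ordinal
            start="{('yes')[regex-group(1)]}"
            end="{('yes')[regex-group(2)]}">
          <xsl:value-of select="."/>
        </ordinal>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <text><xsl:value-of select="."/></text>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With the sample data, the first two "text" elements should end up as a single "text" element, while the ordinals and the remaining "text" elements stay separate. To honour the real rules you would refine the grouping key, for example by taking the 'start' and 'end' attributes of the neighbouring ordinals into account.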

Good luck coding!

Cheers,
-- Abel
