OT: request help with string-based/ RegEx problem

Subject: OT: request help with string-based/ RegEx problem
From: "Jakob" <jakob@xxxxxxxx>
Date: Thu, 1 Apr 2004 12:04:18 +0200 (CEST)
Hello everybody,

I have the following problem:

I need to find all one- and two-character words in my
document, such as "L", "GG", "Bz", "mm", as well as entities
representing a single character, such as "&#8711;" (up-turned
delta).  Since any such combination is possible, enumerating
them would make a very long list.  Once found, I'd like to
wrap each of these strings in an element, like this:
<sym>L</sym>, <sym>GG</sym>, <sym>&#8711;</sym> ...

Furthermore, I am not interested in these character
sequences when they occur inside certain elements.  For
example, in <xref refid="abc">Part B</xref> I do not want to
tag the "B".  There is a limited number of such exclusions.

My understanding is that handling this in XSLT (1.0, at
least) is not possible.  I cannot currently switch to 2.0,
so I thought the best way would be to use regular
expressions (run as an Ant task) to accomplish the same
goal.

While I have no trouble creating a regex that finds all
one- or two-character words, I have not yet found a way to
express the contextual constraints.

The following is a "pseudo regex" expressing this idea:

------8<------
not following <xref[^>]+> or <syd1[^>]+> ...
  (.*)
  <                              ==> word start
  ([a-zA-Z] | [a-zA-Z][a-zA-Z])  ==> target
  >                              ==> word end
  (.*)
not before </xref> or </syd1> ...
------8<------
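
For illustration only, the same idea can be sketched without
forcing the context rules into one regular expression: split the
input into tags and text, keep a counter of how deep we are inside
excluded elements, and only tag matches in text outside them.  This
is a rough Python sketch, not an Ant task.  The function name
tag_short_words and the EXCLUDED set are made up for the example,
and it assumes well-formed input with no ">" inside attribute
values, no comments or CDATA sections, and no existing <sym>
elements.  Note that ordinary short words such as "is" or "in"
match the pattern as well.

```python
import re

# Hypothetical exclusion list: element names whose content stays untouched.
EXCLUDED = {"xref", "syd1"}

# Matches an opening, closing, or self-closing tag; everything else is text.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^>]*?(/?)>")

# One or two letters, or one numeric character reference, standing alone.
# The lookarounds keep parts of longer words and of entities such as
# "&lt;" or "&#x2207;" from matching.
TARGET = re.compile(r"(?<![\w&#;])(?:[A-Za-z]{1,2}|&#\d+;)(?![\w;])")


def wrap(text):
    """Wrap every target found in a run of plain text in <sym>...</sym>."""
    return TARGET.sub(r"<sym>\g<0></sym>", text)


def tag_short_words(xml):
    out = []
    depth = 0  # number of excluded elements currently open
    pos = 0
    for m in TAG.finditer(xml):
        text = xml[pos:m.start()]
        # Text inside an excluded element is copied unchanged.
        out.append(text if depth else wrap(text))
        closing, name, self_closing = m.groups()
        if name in EXCLUDED and not self_closing:
            depth += -1 if closing else 1
        out.append(m.group(0))
        pos = m.end()
    tail = xml[pos:]
    out.append(tail if depth else wrap(tail))
    return "".join(out)


sample = ('<p>Force L, gain GG, operator &#8711;, see '
          '<xref refid="abc">Part B</xref>, unit mm.</p>')
print(tag_short_words(sample))
```

With this sample, "L", "GG", "&#8711;" and "mm" end up wrapped in
<sym>, while the "B" inside <xref> is left alone.  A real XML
parser would be more robust than the TAG pattern used here.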

Again, I am aware this can be regarded as off-topic.
Also, if there is an XSL-based solution, or a different
approach altogether, I would be happy to learn about it.

Thanks in advance.

Cheers,
Jakob.
