Subject: RE: XSLT match with regex what's the best current solution?
From: "Michael Kay" <michael.h.kay@xxxxxxxxxxxx>
Date: Wed, 16 Jan 2002 09:37:32 -0000
> I am working on a suite of scripts that induce structure in
> free text and eventually capture fine grained medical information.
> I have been using AWK so far, but I am thinking about making
> this a process largely of XML transformations. However, since I
> must induce XML structure from semi-structured free text I need
> some more parsing support. First, regular expressions. I know
> there is EXSLT but are regex matches and replaces supported
> in SAXON (I love SAXON, so I would prefer using it.)

Saxon doesn't currently have any regex support (not even the limited
facilities described in EXSLT, nor those in the draft XSLT 2.0 WD, let alone
the more sophisticated facilities being discussed on this list).

But it shouldn't be too difficult to write some Saxon extension functions
that call functions in a regular expression library. (There are a number of
such libraries around, and I haven't done a detailed evaluation or
comparison. I believe there's one in apache jakarta, one in IBM alphaworks,
one in the JDK 1.4 beta.) You might be able to call these libraries
directly, or you may find it's easier if you write some wrapper code around
them.
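
A minimal sketch of what such wrapper code might look like, assuming the
java.util.regex package (the library that ships with JDK 1.4). The class name
RegexExt and its three methods are hypothetical, chosen only for illustration.
How the class is bound to a namespace prefix in the stylesheet depends on the
Saxon version and is not shown here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class (not part of Saxon). Static methods that take
// and return simple types are the easiest shape to call from a stylesheet.
// Declare the class public, in its own RegexExt.java, when deploying it.
class RegexExt {

    // True if the regex matches anywhere in the input string.
    public static boolean matches(String input, String regex) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // Replace every match of the regex with the replacement string.
    public static String replace(String input, String regex, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    // Text of capture group n in the first match, or "" if there is no match
    // or the group did not take part in the match.
    public static String group(String input, String regex, int n) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "";
        }
        String g = m.group(n);
        return g == null ? "" : g;
    }
}
```

Compiling the pattern on every call keeps the sketch short; a real wrapper
would probably cache compiled Pattern objects.
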
> Also, any ideas of additional parsing tools and their integration
> into XSLT would be appreciated. Is there a way of running XSLT
> in line-mode and have every line matched against regular
> expressions? Well, I suppose so, with a simple sed script I could
> first wrap each line into a <line>...</line> tag and then use regex
> match on the text node of each <line> element.

You could break the text into lines using the saxon:tokenize() extension
function.
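
If the splitting is done before the transformation instead (the
<line>...</line> wrapping suggested in the question), it could be sketched in
plain Java as below. This is not saxon:tokenize(); the class name LineWrapper
and the <lines> root element are assumptions made for the example.

```java
// Hypothetical pre-processing step: wrap each line of free text in a <line>
// element so that a stylesheet can process the lines one by one.
class LineWrapper {

    // Escape the characters that are significant in XML text content.
    // The ampersand must be handled first.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    // Split on LF or CRLF and emit one <line> element per input line.
    // Trailing empty lines are dropped by String.split.
    public static String wrap(String text) {
        StringBuilder out = new StringBuilder("<lines>\n");
        for (String line : text.split("\r?\n")) {
            out.append("<line>").append(escape(line)).append("</line>\n");
        }
        return out.append("</lines>\n").toString();
    }
}
```
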
> Is SAXON easy to extend? I suppose there is some documentation
> of SAXON that tells me how to write extensions in Java, right?
> Any reason why it would be better to use something other than
> SAXON if my platform is Java and I'm not interested in Web stuff
> (in which case I would look into the Apache work.)

If you know Java, writing Saxon extension functions isn't difficult. It's
described in the extensibility.html file that comes with the download.

(If you have any specific problems, please raise them on the Saxon help list
at saxon.sf.net)

Mike Kay

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
