tokenize() and regex-group ?
Hi all,
I'm tokenizing some text within a recursive template. The goal is to generate links to some "definitions" inside the doc.

Let's say my text is "my foo bar":
=> the 1st level of recursion searches for "bar" as a defined anchor in the doc; if it is not found, I increase a $lookBacklevel param;
=> the 2nd level of recursion searches for "foo bar", and so on, until it finds a matching definition, or throws an error if there is none;
=> when a definition is found, the text is output with a link:

    <p>... my <link idref="#anchorFooBar">foo bar</link> ...</p>

To do so, I tokenized the text on spaces:

    <xsl:variable name="tokenText" select="tokenize($text,' ')" as="xs:string*"/>

and then I build 2 strings depending on the recursion param $lookBacklevel:

    <xsl:variable name="textBegin" select="string-join($tokenText[position() lt ($tokenNum - $lookBacklevel + 1)],' ')"/>
    <xsl:variable name="textEnd" select="string-join($tokenText[position() ge ($tokenNum - $lookBacklevel + 1)],' ')"/>

I then search for a matching definition:

    <xsl:variable name="matchingAncres" select="$ancres[normalize-space($textEnd)!=''][igs:match-ancre(.,$textEnd)]" as="element()*"/>

(the matching rules are defined in a specific function)

The problem I've got is that the tokenize separator is too specific: it's only a space, and sometimes words are separated by other characters, like:
- non-breaking space " "
- opening parenthesis "("
- French quotes "«"
- ...

I could use a regex like "[\s(]«" as the 2nd arg of tokenize(), but I would then not be able to reconstruct the string.

So is there a way to get the separator that has been matched by the regex of tokenize()? Just like regex-group() does when using <xsl:analyze-string>?

I think the answer is "no", but maybe I'm missing a trick to achieve this?

I could maybe use <xsl:analyze-string>, but this is not so easy because of the recursive template: the regex will depend on the $lookBacklevel param. I'm not sure I can find the right pattern...

Regards,
Matthieu.
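To make the <xsl:analyze-string> idea more concrete, here is a minimal sketch under stated assumptions, not a tested solution. It keeps one fixed separator regex (so the regex does not depend on $lookBacklevel) and returns words and separators as one flat sequence, so the string can be rebuilt with string-join(..., ''). The function igs:split-keep-separators, the $sepRegex variable, the "demo" template, the igs namespace URI and the sample text are all hypothetical names introduced for illustration; the sketch also assumes $lookBacklevel starts at 1 and never exceeds the number of words.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:igs="http://example.org/igs"
    exclude-result-prefixes="xs igs">

  <!-- Assumed separator set: whitespace, "(", "«" (&#171;) and the
       non-breaking space (&#160;), which \s does not cover in XPath regexes. -->
  <xsl:variable name="sepRegex" select="'[\s\(&#171;&#160;]+'" as="xs:string"/>

  <!-- Hypothetical helper: returns words AND separators, in document order,
       so that string-join(result, '') gives back the original text. -->
  <xsl:function name="igs:split-keep-separators" as="xs:string*">
    <xsl:param name="text" as="xs:string"/>
    <xsl:analyze-string select="$text" regex="{$sepRegex}">
      <xsl:matching-substring>
        <xsl:sequence select="."/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:sequence select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:function>

  <!-- Hypothetical entry point showing the split for one $lookBacklevel. -->
  <xsl:template name="demo">
    <xsl:param name="text" as="xs:string" select="'my (foo&#160;bar'"/>
    <xsl:param name="lookBacklevel" as="xs:integer" select="2"/>

    <xsl:variable name="parts" as="xs:string*"
        select="igs:split-keep-separators($text)"/>

    <!-- Positions in $parts that hold a word rather than a separator. -->
    <xsl:variable name="wordPos" as="xs:integer*"
        select="for $i in 1 to count($parts)
                return if (matches($parts[$i], concat('^', $sepRegex, '$')))
                       then () else $i"/>

    <!-- Position of the first word of the candidate definition,
         counting $lookBacklevel words back from the end. -->
    <xsl:variable name="cut" as="xs:integer"
        select="$wordPos[last() - $lookBacklevel + 1]"/>

    <result>
      <textBegin><xsl:value-of select="string-join($parts[position() lt $cut], '')"/></textBegin>
      <textEnd><xsl:value-of select="string-join($parts[position() ge $cut], '')"/></textEnd>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

With the sample text and $lookBacklevel = 2, textBegin should hold "my (" and textEnd should hold "foo bar" with its original non-breaking space, because the separators travel with the sequence instead of being discarded as tokenize() would do. The recursion only has to change $lookBacklevel; the regex stays the same at every level.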
-- Matthieu Ricaud 05 45 37 08 90 IGS-CP, service livres numériques