Re: {} quantifiers in regex
Geert Bormans wrote:
The regex attribute of analyze-string is an AVT. Curly braces have a special meaning in both an AVT and a regular expression, and the way to use a curly brace in any AVT without it being interpreted as the start/end of an expression is to double it. Because curly braces are used often in regexes and because their content is usually a number, the result is not an illegal AVT:

    \d{2}

is interpreted as the regular expression:

    \d2

which will quite likely match sometimes and sometimes not, but not when you want it. The resulting behavior has all the features of a buggy regular expression parser which in fact is a buggy expression itself... ;)

Because I used to make this mistake often (and because escaped quotes and doubled curly braces look ugly), I started to put the regular expression into a variable in all but the most trivial cases. The added benefit of this is that you can now use comments in a regular expression:

    <xsl:variable name="regex" as="xs:string">
        \d     <!-- a digit -->
        {2}    <!-- must occur twice and only twice -->
    </xsl:variable>

    <xsl:analyze-string regex="{$regex}" flags="x">
        ...
    </

Note the use of the 'x' modifier, which is necessary here because it makes the processor ignore the whitespace in the regex. Regular expressions have the tendency to be the most unreadable of existing mini-languages, so comments and whitespace are often very welcome. The as="xs:string" is there because we don't need a document node but a string.

For the fun of it and to complete this little story, note that in the world of obfuscation a lot is possible, if you set your mind to it. If you want it and you like fun code, you *can* put comments inside a regular expression (but only inside an AVT) using the following, imo rather silly construction:

    <xsl:analyze-string flags="x" regex="
        \d      {()(: a digit :)}
        {{2}}   {()(: must occur twice and only twice :)}">

The () is there because an XPath expression cannot be empty. The (: and :) are, of course, the comment delimiters for an XPath 2.0 expression.
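To make the difference concrete, here is a minimal sketch of a complete XSLT 2.0 stylesheet that runs the unescaped AVT, the doubled-brace AVT, and the variable-based form against the same string. The sample string in $input, the match="/" template, and the bracketed output format are assumptions for illustration only and are not part of the original discussion.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Hypothetical sample input, chosen so the two regexes differ -->
  <xsl:variable name="input" as="xs:string" select="'a12 b2 c345'"/>

  <!-- Regex kept in a variable: braces need no doubling here, and
       stylesheet comments are dropped before the string is built -->
  <xsl:variable name="regex" as="xs:string">
    \d   <!-- a digit -->
    {2}  <!-- must occur twice and only twice -->
  </xsl:variable>

  <xsl:template match="/">
    <!-- 1. Unescaped: the AVT evaluates {2} to "2", so the regex is \d2 -->
    <xsl:text>unescaped: </xsl:text>
    <xsl:analyze-string select="$input" regex="\d{2}">
      <xsl:matching-substring>[<xsl:value-of select="."/>]</xsl:matching-substring>
    </xsl:analyze-string>
    <xsl:text>&#10;</xsl:text>

    <!-- 2. Doubled braces: the AVT yields the literal regex \d{2} -->
    <xsl:text>doubled:   </xsl:text>
    <xsl:analyze-string select="$input" regex="\d{{2}}">
      <xsl:matching-substring>[<xsl:value-of select="."/>]</xsl:matching-substring>
    </xsl:analyze-string>
    <xsl:text>&#10;</xsl:text>

    <!-- 3. Variable: flags="x" strips the whitespace left in $regex -->
    <xsl:text>variable:  </xsl:text>
    <xsl:analyze-string select="$input" regex="{$regex}" flags="x">
      <xsl:matching-substring>[<xsl:value-of select="."/>]</xsl:matching-substring>
    </xsl:analyze-string>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Run against any XML document with an XSLT 2.0 processor, the first line should report only [12] (a digit followed by a literal 2), while the other two should report [12][34].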
I don't know about others' opinions on this, but from my point of view, the XPath-comment construction doesn't add much to readability, so I still prefer the "best practice" of putting the regex in a variable (what helps that decision is that some XSLT 2.0 processors do not allow the smiley comments).

Cheers,
-- Abel Braaksma