Re: Re: backticks in regex - tales of the unexpected p
On 7-4-2014 18:35, Ihe Onwuka wrote:
> Just going by the definition of the \w class in MK's XPath 2.0
> reference - \w -> a character considered to form part of a word

Essentially, it is the Unicode standard that defines whether something is outside of \p{C}, \p{Z} or \p{P}. And I would find it rather strange if the grave accent were _not_ considered a possible part of a word, similarly to the diaeresis, breve, cedilla etc. Its counterpart, the acute accent, is categorized the same way. But not the apostrophe, which is often mistaken for an acute accent, but really isn't one.

I understand the confusion: consider the math and currency symbols. The same XSLT book you are quoting tells you that they are part of \w as well. How is $, + or > a word character? I don't know. I guess the Unicode consortium just had to draw the line somewhere.

> So it's TS if backtick isn't a word character in your vocabulary.
> Probably neither the first or the last to get caught by that one.

Not sure what TS means. But I'm sure you are not the last to get caught by that one.

Personally, I hardly ever use \w because I find it very hard to understand what it does and does not match. Is the following a word? Tell`>me$45). I find it easiest to define the subranges myself, or to use the \p{Category} syntax, which I find clearer.

Cheers,
Abel
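
For readers who want to check the rule described above for themselves, here is a minimal sketch in Python. It does not run an XPath regex engine; it only emulates the stated definition (a character is a \w character when its Unicode general category is outside P, Z and C) using the standard `unicodedata` module. The function names `is_xsd_word_char` and `classify` are made up for this illustration, and Python's own `re` module defines `\w` differently, so it is deliberately not used here.

```python
import unicodedata


def is_xsd_word_char(ch: str) -> bool:
    # Emulates the rule from the message: a character counts as a word
    # character unless its general category starts with P (punctuation),
    # Z (separator) or C (other). Hypothetical helper, not a library API.
    return unicodedata.category(ch)[0] not in "PZC"


def classify(text: str):
    # Returns (character, general category, matches-the-rule) per character.
    return [(ch, unicodedata.category(ch), is_xsd_word_char(ch)) for ch in text]


if __name__ == "__main__":
    # The sample string from the message, plus apostrophe and acute accent.
    for ch, cat, word in classify("Tell`>me$45).'\u00b4"):
        print(repr(ch), cat, "word" if word else "not word")
```

Under this rule the backtick, the acute accent, `$`, `+` and `>` all count as word characters (they are in the symbol categories), while the apostrophe, the closing parenthesis and the full stop do not (they are punctuation).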