Re: [XSLT2.0] xsl:analyze-string@regex syntax too limi
>>>>> "Gunther" == Gunther Schadow <gunther@xxxxxxxxxxxxxxxxxxxxxx> writes:
Gunther> The boundary matcher matches a zero-width substring
Gunther> between a character matching the character class
Gunther> [A-Za-z_0-9] and a character matching the character class
Gunther> [^A-Za-z_0-9] or vice versa. </quote>
Gunther> This is pretty clear. It may not make the
Gunther> internationalization people very happy because I can't do
Gunther> word-boundary matches on Hindi text. That's a true
Gunther> concern.
So address it. Unicode report TR18 says (for Level 1 support):
RL1.4 Simple Word Boundaries
To meet this requirement, an implementation shall extend the word boundary mechanism so that:
1. The class of <word_character> includes all the Alphabetic values from the Unicode character database, from UnicodeData.txt [UData]. See also Annex C: Compatibility Properties.
2. Non-spacing marks are never divided from their base characters, and otherwise ignored in locating boundaries.
Level 2 provides more general support for word boundaries between
arbitrary Unicode characters which may override this behavior.
Level 1 support should certainly be met.
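
To make the difference concrete, here is a rough Python sketch (not XSLT, and not taken from any processor) that locates boundaries under the ASCII-only definition quoted above and under an approximation of RL1.4. The function names are made up for illustration. Python's unicodedata module does not expose the Alphabetic property, so the sketch assumes that str.isalpha() plus the Nl category is close enough, and it attaches every combining mark (general category M*, a superset of the non-spacing marks RL1.4 names) to its base character.

```python
import unicodedata


def is_ascii_word(ch):
    # The character class [A-Za-z_0-9] from the quoted definition.
    return ch.isascii() and (ch.isalnum() or ch == "_")


def is_simple_word(ch):
    # Assumption: approximates the Unicode Alphabetic property with the
    # letter categories (str.isalpha) plus letter-like numbers (Nl).
    return is_ascii_word(ch) or ch.isalpha() or unicodedata.category(ch) == "Nl"


def boundaries(text, is_word, attach_marks=False):
    """Return the offsets where a word character meets a non-word character.

    The start and end of the string count as non-word characters.
    """
    flags = [False]  # sentinel for "before the string"
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if attach_marks and is_mark and len(flags) > 1:
            # A combining mark inherits the status of its base character,
            # so no boundary can fall between the two.
            flags.append(flags[-1])
        else:
            flags.append(is_word(ch))
    flags.append(False)  # sentinel for "after the string"
    return [i for i in range(len(text) + 1) if flags[i] != flags[i + 1]]


# Two Devanagari words separated by a space, written with escapes so the
# code points are unambiguous (13 code points in total).
HINDI = "\u0928\u092e\u0938\u094d\u0924\u0947 \u0926\u0941\u0928\u093f\u092f\u093e"

print(boundaries("hello world", is_ascii_word))            # [0, 5, 6, 11]
print(boundaries(HINDI, is_ascii_word))                    # []
print(boundaries(HINDI, is_simple_word, attach_marks=True))  # [0, 6, 7, 13]
```

Under the ASCII class the Hindi string has no word boundaries at all, which is exactly the concern raised above. With the extended class and marks kept on their base characters, the boundaries fall around the two words.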
--
Colin Paul Adams
Preston Lancashire