RE: XSLT script to report Unicode characters and code
> Take alpha U+0391. The UCD says that is Lu so it matches
> \p{Lu} but that just tells you it's an upper case letter, it
> doesn't tell you it's in the block
> <block start="00370" end="003FF" name="Greek and
> Coptic"/> does it?

That's true if by "UCD" you mean the UnicodeData.txt file. But that's only one of the files in the Unicode database; another file is Blocks.txt, which does contain the required information.

Incidentally, the block names in the current version of Blocks.txt do not exactly match the block names defined in schema (and XPath) regular expressions. For example, 0370..03FF was once "Greek" but is now "Greek and Coptic". The Schema WG is close to deciding that the Unicode names are definitive, which means that regular expressions become invalid when Unicode decides to change the names of the blocks...

Michael Kay
http://www.saxonica.com/
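To illustrate the distinction between the two files, here is a rough sketch in Python (not the XSLT from this thread). It assumes the `start..end; Block Name` line format used by Blocks.txt and works on a three-line excerpt rather than the real file; the names `SAMPLE_BLOCKS`, `parse_blocks` and `block_of` are made up for this example.

```python
import unicodedata

# A few lines in the format used by the UCD Blocks.txt file
# ("start..end; Block Name"). This is an illustrative excerpt only;
# a real script would read the complete file.
SAMPLE_BLOCKS = """\
# Sample excerpt, not the full file
0000..007F; Basic Latin
0370..03FF; Greek and Coptic
0400..04FF; Cyrillic
"""


def parse_blocks(text):
    """Return a list of (start, end, name) tuples from Blocks.txt-style text."""
    blocks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        range_part, name = line.split(";", 1)
        start, end = range_part.split("..")
        blocks.append((int(start, 16), int(end, 16), name.strip()))
    return blocks


def block_of(char, blocks):
    """Return the name of the block containing char, or None if not listed."""
    cp = ord(char)
    for start, end, name in blocks:
        if start <= cp <= end:
            return name
    return None


blocks = parse_blocks(SAMPLE_BLOCKS)
alpha = "\u0391"  # GREEK CAPITAL LETTER ALPHA

print(unicodedata.category(alpha))  # general category, as in UnicodeData.txt
print(block_of(alpha, blocks))      # block name, from the Blocks.txt excerpt
```

The general category comes from Python's built-in `unicodedata` module, while the block name has to be looked up separately, which mirrors the point that the two pieces of information live in different files of the database.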