
RE: XSLT script to report Unicode characters and code

Subject: RE: XSLT script to report Unicode characters and code blocks in file?
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Fri, 30 May 2008 13:01:21 +0100
> Take alpha U+0391.  The UCD says that is Lu so it matches 
> \p{Lu} but that just tells you it's an upper case letter, it 
> doesn't tell you it's in the block
>       <block start="00370" end="003FF" name="Greek and 
> Coptic"/> does it? 

That's true if by "UCD" you mean the UnicodeData.txt file. But that's only
one of the files in the Unicode Character Database; another file, Blocks.txt,
does contain the required information.
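
To make the distinction concrete, here is a minimal Python sketch (not XSLT, and not anything from the original thread) that takes the general category from the UnicodeData.txt side via the standard unicodedata module and the block from lines in the Blocks.txt format. The embedded sample is only a few lines of that file, and the helper names parse_blocks and block_of are made up for illustration.

```python
import unicodedata
from bisect import bisect_right

# A few lines in the Blocks.txt format ("start..end; name").
# Assumption: in practice the full file from the Unicode Character
# Database would be read instead of this short sample.
BLOCKS_SAMPLE = """\
# comment lines start with '#'
0000..007F; Basic Latin
0080..00FF; Latin-1 Supplement
0370..03FF; Greek and Coptic
0400..04FF; Cyrillic
"""

def parse_blocks(text):
    """Return a sorted list of (start, end, name) tuples."""
    blocks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        range_part, name = line.split(";", 1)
        start, end = range_part.split("..")
        blocks.append((int(start, 16), int(end, 16), name.strip()))
    blocks.sort()
    return blocks

def block_of(char, blocks):
    """Return the block name for one character, or None if not listed."""
    cp = ord(char)
    starts = [b[0] for b in blocks]
    i = bisect_right(starts, cp) - 1  # last block starting at or before cp
    if i >= 0 and blocks[i][0] <= cp <= blocks[i][1]:
        return blocks[i][2]
    return None

blocks = parse_blocks(BLOCKS_SAMPLE)
alpha = "\u0391"                        # GREEK CAPITAL LETTER ALPHA
category = unicodedata.category(alpha)  # general category only
block = block_of(alpha, blocks)         # block comes from the other file
print(category, "/", block)
```

The category and the block are independent pieces of data, which is why a report on code blocks needs the second file.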

Incidentally, the block names in the current version of Blocks.txt do not
exactly match the block names defined for XML Schema (and XPath) regular
expressions: for example, 0370..03FF was once "Greek" but is now "Greek and
Coptic". The Schema WG is close to deciding that the Unicode names are
definitive, which means that a regular expression can become invalid when
Unicode decides to rename a block...
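
A small Python sketch of why a rename matters. It assumes the convention that a block escape is written as \p{Is...} with the spaces removed from the block name; the helper block_escape is hypothetical and only builds the string, it does not evaluate any regular expression.

```python
def block_escape(block_name):
    # Assumption: the escape is "Is" plus the block name with all
    # whitespace removed, wrapped in \p{...}.
    return r"\p{Is" + "".join(block_name.split()) + "}"

old_escape = block_escape("Greek")             # name once used for 0370..03FF
new_escape = block_escape("Greek and Coptic")  # name in current Blocks.txt
print(old_escape, "vs", new_escape)
```

If only names derived from the current file are accepted, an expression written against the old name no longer refers to a known block.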

Michael Kay
