RE: [xsl] XSLT script to report Unicode characters and code

Subject: RE: XSLT script to report Unicode characters and code blocks in file?
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Fri, 30 May 2008 13:01:21 +0100

> Take alpha U+0391.  The UCD says that is Lu so it matches
> \p{Lu} but that just tells you it's an upper case letter, it
> doesn't tell you it's in the block
>       <block start="00370" end="003FF" name="Greek and Coptic"/>
> does it?

That's true if by "UCD" you mean the UnicodeData.txt file. But that's only
one of the files in the Unicode database; another file is Blocks.txt, which
does contain the required information.
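
As a rough illustration of the difference between the two files, here is a
minimal Python sketch (not XSLT, and not anything from this thread). It
assumes the "start..end; name" line format used by Blocks.txt and embeds a
three-line excerpt rather than reading the real file; the names
SAMPLE_BLOCKS, parse_blocks and block_of are made up for the example.

```python
import unicodedata

# A few lines in the format used by Blocks.txt (excerpt for illustration only;
# a real run would read the full file from the Unicode Character Database).
SAMPLE_BLOCKS = """\
# Start Code..End Code; Block Name
0000..007F; Basic Latin
0370..03FF; Greek and Coptic
0400..04FF; Cyrillic
"""


def parse_blocks(text):
    """Return a list of (start, end, name) tuples from Blocks.txt-style text."""
    blocks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        code_range, name = line.split(";", 1)
        start, end = code_range.strip().split("..")
        blocks.append((int(start, 16), int(end, 16), name.strip()))
    return blocks


def block_of(char, blocks):
    """Return the name of the block containing char, or None if not listed."""
    cp = ord(char)
    for start, end, name in blocks:
        if start <= cp <= end:
            return name
    return None


blocks = parse_blocks(SAMPLE_BLOCKS)
alpha = "\u0391"  # GREEK CAPITAL LETTER ALPHA

# The general category comes from UnicodeData.txt; the block name does not.
print(unicodedata.category(alpha), block_of(alpha, blocks))
```

The general category (Lu) and the block name come from different sources
here, which is the point: the category alone cannot tell you the block.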

Incidentally, the current version of Blocks.txt does not exactly match the
names of the blocks as defined in XML Schema (and XPath) regular expressions:
for example, 0370..03FF was once "Greek" but is now "Greek and Coptic". The
Schema WG is close to deciding that the Unicode names are definitive, which
means that regular expressions become invalid when Unicode decides to change
the names of the blocks...

Michael Kay
http://www.saxonica.com/

Current Thread

Re: XSLT script to report Unicode characters and code blocks in file?, (continued)
- David Carlisle - 29 May 2008 21:13:58 -0000
  - Colin Paul Adams - 30 May 2008 05:41:30 -0000
    - Michael Kay - 30 May 2008 07:10:27 -0000
    - David Carlisle - 30 May 2008 11:47:22 -0000
    - Michael Kay - 30 May 2008 12:01:48 -0000 <=
    - Colin Paul Adams - 30 May 2008 12:42:28 -0000
    - David Carlisle - 30 May 2008 12:51:11 -0000
