Subject: Re: XSLT 2.0 : Unicode hex notation in regular expressions
From: David Carlisle <davidc@xxxxxxxxx>
Date: Thu, 12 Aug 2004 12:14:56 +0100
> Sorry to insist : why don't they work ?

Because that's life:-)

>  Aren't they supposed to do ?

No. The regex syntax in XSLT is (except where otherwise noted) that of W3C
XML Schema, and that doesn't have any notation like that.

> If so, is it a Saxon-related problem or a more general one that would 
> indicate that UTS #18 is still to be implemented, is irrelevant or 
> whatever ?

The _semantics_ of Unicode regexps come from there, e.g. the predefined
character classes (you may prefer to use a character class referring to
the Arabic block, for example, rather than use explicit code points). But
(I would guess) the U+ notation wasn't supported because it is the Unicode
standard's way of referring to characters by code point in plain ASCII
text, and that is never used in an XML context. U+06FF is legal XML
character data, but it is just those 6 characters; if you want to refer to
the character hex 06FF you always use &#x06FF; in XML.
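
To make that concrete, here is a minimal sketch (not tested against any
particular processor) of a stylesheet that puts character references
straight into the regex. The XML parser expands them before the XPath
engine ever sees the pattern, so the regex just contains a plain
character range. The choice of 0600 to 06FF as end points is my own
illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- &#x0600; and &#x06FF; are expanded by the XML parser, so the
         regex engine receives a character class [X-Y] holding the two
         actual characters, not the escape text. -->
    <xsl:value-of select="matches(string(.), '[&#x0600;-&#x06FF;]')"/>
  </xsl:template>
</xsl:stylesheet>
```

This should output true when the input document contains at least one
character in that range. Writing 'U+0600' in the pattern instead would
only ever match those six literal characters.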


>  How, for example, to use a useful syntax like 
>  matches(.,'\p{Script:Arabic}+') ?

schema-2 says: http://www.w3.org/TR/xmlschema-2/#regexs

[Definition:] [Unicode Database] groups code points into a number of
blocks such as Basic Latin (i.e., ASCII), Latin-1 Supplement, Hangul
Jamo, CJK Compatibility, etc. The set containing all characters that
have block name X (with all white space stripped out), can be identified
with a block escape \p{IsX}. The complement of this set is specified
with the block escape \P{IsX}. ([\P{IsX}] = [^\p{IsX}]).
...
For example,
the "block escape" for identifying the ASCII characters is \p{IsBasicLatin}. 



So that would be \p{IsArabic}.
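
As a rough sketch of how that might look in a stylesheet (the item
element name is purely hypothetical, pick whatever your input uses):

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical input: a document containing <item> elements.
         Select those whose text has at least one character from the
         Unicode block named "Arabic". -->
    <xsl:for-each select="//item[matches(., '\p{IsArabic}')]">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

matches() succeeds if any substring matches, so to require that the
whole string consists of characters from that block you would anchor the
pattern: '^\p{IsArabic}+$'. Note this is a block test, not a script
test; the schema regex language has no script escapes.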

David
