
Re: recognize character entities

Subject: Re: recognize character entities
From: Abel Online <abel.online@xxxxxxxxx>
Date: Wed, 30 Aug 2006 11:41:31 +0200
Florent Georges wrote:
<xsl:variable name="entity.values"
select="('&#65533;...', '&#65533;...', ...)"/>
Perhaps it is easier, if I may suggest so, to use regular expressions. I think they would require a lot less work to create, because the character entities used for MathML often fall inside contiguous ranges. Looking around at the entity tables on http://www.w3.org/TR/2003/REC-MathML2-20031021/chapter6.html#chars.entity.tables, I found that most sets are more or less complete parts of the Unicode 4.0 specification.

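For instance, almost all characters in the range U+2200 to U+22FF are included (the Mathematical Operators block in Unicode). The regular expression for this is: [&#x2200;-&#x22FF;] (written with character references, because the XPath 2.0 regular expression syntax has no \x escape). I'm not sure whether every processor supports this as well, but mathematical symbols ought to be matched by the simple expression \p{Sm} (lower-case p; the upper-case \P{Sm} matches everything that is not in that category).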

Similar constructs are available for Greek and Cyrillic: \p{IsGreek} and \p{IsCyrillic}.
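As a rough sketch of how these expressions could be used, the XSLT 2.0 stylesheet below wraps them in a function built on matches(). The names my:has-special and $special-char and the urn:example:my namespace are made up for this illustration, and the choice of ranges is only an assumption about what you want to catch. Note the lower-case \p, which matches a category or block; the upper-case \P matches its complement.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:my"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical pattern: one character from U+2200..U+22FF, from the
       Sm (math symbol) category, or from the Greek or Cyrillic block.
       The XML parser expands the character references before the regex
       engine sees the string. Be aware that \p{Sm} also covers ASCII
       symbols such as + and =. -->
  <xsl:variable name="special-char" as="xs:string"
      select="'[&#x2200;-&#x22FF;\p{Sm}\p{IsGreek}\p{IsCyrillic}]'"/>

  <!-- True if $text contains at least one character from the pattern. -->
  <xsl:function name="my:has-special" as="xs:boolean">
    <xsl:param name="text" as="xs:string?"/>
    <xsl:sequence select="matches($text, $special-char)"/>
  </xsl:function>

  <!-- Example use: flag every text node that contains such a character. -->
  <xsl:template match="text()[my:has-special(.)]">
    <flagged><xsl:value-of select="."/></flagged>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the pattern in a variable means it is written once and can be reused in matches(), replace() or xsl:analyze-string.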

Some ranges may be too wide, but there is probably little chance that your content contains symbols that exist in Unicode but are not used by MathML.

Some characters are specified by MathML as a base character plus a combining diacritical mark. I think you will have to list them separately in your regular expression. The same is true for the "normal" Latin-1 characters that are part of MathML, like &amp;, &aacute;, &Acirc; etc.
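One way to combine the ranges with the separately listed characters is to build the expression from parts and hand it to xsl:analyze-string. This is only a sketch: the variable names, the <char> output element and the three Latin-1 characters (&amp;, U+00E1, U+00C2) are placeholders, and the combining-mark ranges U+0300..U+036F and U+20D0..U+20FF are my assumption about which marks you need, not a list taken from the MathML tables.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Assumption: any character followed by one or more combining marks. -->
  <xsl:variable name="combined" as="xs:string"
      select="'.[&#x0300;-&#x036F;&#x20D0;-&#x20FF;]+'"/>

  <!-- Latin-1 characters listed one by one (placeholder selection). -->
  <xsl:variable name="latin1" as="xs:string"
      select="'[&amp;&#xE1;&#xC2;]'"/>

  <!-- Whole blocks that can be covered by a range or a block escape. -->
  <xsl:variable name="blocks" as="xs:string"
      select="'[&#x2200;-&#x22FF;\p{IsGreek}\p{IsCyrillic}]'"/>

  <!-- Join the parts into one alternation. -->
  <xsl:variable name="entity-regex" as="xs:string"
      select="string-join(($combined, $latin1, $blocks), '|')"/>

  <xsl:template match="text()">
    <xsl:analyze-string select="." regex="{$entity-regex}">
      <xsl:matching-substring>
        <!-- Report the code points of each matched character or sequence. -->
        <char codepoints="{string-to-codepoints(.)}">
          <xsl:value-of select="."/>
        </char>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

The regex attribute is an attribute value template, so passing the pattern through a variable also avoids having to double any curly braces that appear in the expression.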

Using this approach you do not have to wonder whether a character entity is written using its decimal numeric equivalent, the hexadecimal notation or the named notation: the XML parser resolves all three to the same character before the stylesheet sees it.

Of course, it will take a few hours to construct your regex, but I think it will be much easier to maintain than a list of all entity values. And, I forgot to say, you can only use it with XSLT 2.0-capable processors.

Hope this helps,

Cheers,
Abel Braaksma
http://abelleba.metacarpus.com
