
Re: XML Max Character Value


On Aug 13, 2005, at 14:19, Alan Gutierrez wrote:

>     Am I seeing that with Unicode in Java, you need to work with
>     String and not with individual char? That puts a dent in my
>     algorithm, which advanced along the characters in the string.

It depends on what exactly you are doing. A Java char is not a Unicode 
character but a UTF-16 code unit. The values \u0000 and \uFFFF should 
never occur in XML and can be used as sentinels if your algorithm works 
on UTF-16 code units. For the purpose of indexing text, working on 
UTF-16 code units as opposed to working on Unicode characters may well 
be good enough. In that case, a surrogate pair can be treated as two 
adjacent "characters". (Note that even when operating on UTF-32, you 
can have tightly-coupled characters when there is a base character 
followed by combining marks, so working on Unicode characters does not 
buy you inter-character independence.)
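
A minimal sketch of the distinction, assuming the text came out of a
well-formed XML document (so U+FFFF cannot appear in it). The class and
method names here are made up for illustration. The first method walks
UTF-16 code units using U+FFFF as an end sentinel; the second walks
Unicode code points using String.codePointAt and Character.charCount,
which are available from Java 5 onward.

```java
import java.util.ArrayList;
import java.util.List;

final class CodeUnitScan {
    // U+FFFF is excluded by XML's Char production, so under the stated
    // assumption it is free to mark the end of the buffer.
    static final char SENTINEL = '\uFFFF';

    /** Walks UTF-16 code units; a surrogate pair counts as two steps. */
    static int countCodeUnits(String text) {
        char[] buf = (text + SENTINEL).toCharArray();
        int i = 0;
        while (buf[i] != SENTINEL) {
            i++;
        }
        return i;
    }

    /** Walks Unicode code points; a surrogate pair counts as one step. */
    static List<Integer> codePoints(String text) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            out.add(cp);
            i += Character.charCount(cp); // 1 for BMP, 2 for a surrogate pair
        }
        return out;
    }

    public static void main(String[] args) {
        // 'a', U+1D11E (one code point, two code units), 'b'
        String astral = "a\uD834\uDD1Eb";
        System.out.println(countCodeUnits(astral));
        System.out.println(codePoints(astral).size());

        // 'e' followed by U+0301 COMBINING ACUTE ACCENT
        String combining = "e\u0301";
        System.out.println(codePoints(combining).size());
    }
}
```

For the first string the two walks disagree on the number of steps,
because the surrogate pair is two code units but one code point. For the
second string even the code point walk takes two steps for what a reader
sees as a single accented letter, which is the coupling described in the
parenthetical above.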

-- 
Henri Sivonen
hsivonen@i...
http://hsivonen.iki.fi/

