
Re: Where does a parser get the replacement text for a characterreferenc

  • From: David Brownell <david-b@p...>
  • To: Lars Marius Garshol <larsga@g...>,xml-dev <xml-dev@l...>
  • Date: Wed, 04 Jul 2001 18:33:39 -0700

> | I assume that it would depend on what encoding the xml that you are
> | parsing has.
> 
> Actually, no.

More like:  "sort of yes".  Java developers tend to assume Unicode is
the universal way to represent character data, but folk working in other
languages may not be so fortunate.  Parser APIs aren't required to
transcode into a UTF (UTF-8, UTF-16, UTF-32); they may deliver
characters in other encodings, including the input encoding.

Using the original U+E311 private-use character as an example,
it could be natural to have some component transcode it to the
local character set.  That may be preferred for Klingon, or for
other characters that don't have code points in Unicode.  (A while
back, I think Taiwan needed to use that approach; dunno if that's
less of an issue in Unicode 3.1.)
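
A minimal sketch of that idea, assuming Python's standard xml.etree.ElementTree parser (which always hands back Unicode strings, so it is only one point in the design space described above). The names LOCAL_PUA_MAP, char_ref_text and transcode_local, and the byte sequence in the table, are hypothetical and exist only for illustration. The sketch shows that &#xE311; resolves to the same code point whatever the document's declared encoding, and that a separate component can then map it into a local character set.

```python
import xml.etree.ElementTree as ET

# Hypothetical table mapping private-use code points to byte sequences in
# a local character set. The bytes here are made up for illustration.
LOCAL_PUA_MAP = {0xE311: b"\xfa\x40"}


def char_ref_text(encoding: str) -> str:
    """Parse a tiny document declared in `encoding` and return its text."""
    doc = f'<?xml version="1.0" encoding="{encoding}"?><doc>&#xE311;</doc>'
    return ET.fromstring(doc.encode(encoding)).text


def transcode_local(text: str, base: str = "latin-1") -> bytes:
    """Encode `text` in `base`, routing private-use characters through
    the hypothetical local mapping table."""
    out = bytearray()
    for ch in text:
        mapped = LOCAL_PUA_MAP.get(ord(ch))
        out += mapped if mapped is not None else ch.encode(base)
    return bytes(out)


# The reference names a code point, not bytes in the input encoding.
same = char_ref_text("UTF-8") == char_ref_text("ISO-8859-1") == "\ue311"
local_bytes = transcode_local("a" + char_ref_text("UTF-8") + "b")
```

Here the transcoding step sits outside the parser; a parser whose API delivers text in a non-Unicode encoding would do the equivalent mapping internally.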

>     Character references always refer to Unicode characters.

Or surrogate pairs -- they refer to ISO-10646 characters, which can
be represented in UTF-16 as one or two 16-bit units.  It's explicitly
illegal to have references to surrogate code points, but characters in the
"Astral Planes" expand to two UTF-16 code units (or one UTF-32 unit).
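
To make the surrogate point concrete, here is a small sketch, again assuming Python's xml.etree.ElementTree (backed by expat); the helper name surrogate_ref_rejected is hypothetical. It resolves a reference to U+10400, a character outside the Basic Multilingual Plane, counts its UTF-16 and UTF-32 code units, and checks that a reference to a lone surrogate code point is refused as not well-formed.

```python
import xml.etree.ElementTree as ET

# A reference to a supplementary-plane character yields one character.
ch = ET.fromstring("<doc>&#x10400;</doc>").text

# Count code units: 2 bytes per UTF-16 unit, 4 bytes per UTF-32 unit.
utf16_units = len(ch.encode("utf-16-be")) // 2
utf32_units = len(ch.encode("utf-32-be")) // 4


def surrogate_ref_rejected() -> bool:
    """Return True if the parser refuses a reference to a surrogate."""
    try:
        ET.fromstring("<doc>&#xD801;</doc>")
    except ET.ParseError:
        return True
    return False
```

With this parser, ch is a single code point, utf16_units is 2 and utf32_units is 1; other parsers may expose the pair of UTF-16 units directly through their API.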

- Dave
