
Your XML documents may use different sets of characters, depending on whi

  • From: "Costello, Roger L." <costello@mitre.org>
  • To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
  • Date: Tue, 17 May 2011 09:10:50 -0400

Hi Folks,

The XML specification lists the set of characters that may be used in XML documents.

The characters are Unicode characters.
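As a small illustration, here is a sketch of a check against the Char production of XML 1.0, which defines the permitted characters as ranges of code points. The function name `is_xml10_char` is made up for this example, and the check covers only the Char production, not the name-character classes that come up later in this message.

```python
def is_xml10_char(ch):
    """Return True if the single character `ch` matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # most of the Basic Multilingual Plane
        or 0xE000 <= cp <= 0xFFFD      # skips the surrogate block, U+FFFE and U+FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


if __name__ == "__main__":
    for sample in ("A", "\t", "\x00", "\uFFFE"):
        print(repr(sample), is_xml10_char(sample))
```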

Unicode has something called categories. A category is a set of characters.

Here is a category: Nd

The Nd category consists of decimal digit characters.

Unicode is an evolving standard. Thus, there are different versions. 

The set of decimal digit characters in the Nd category may vary, depending on the version of Unicode.
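One way to see this variability for yourself, as a rough sketch: Python's standard `unicodedata` module exposes the Unicode tables bundled with the interpreter, plus a frozen copy of the Unicode 3.2.0 tables as `unicodedata.ucd_3_2_0`. Neither of these is Unicode 2.0, so this only illustrates the effect with the two versions that happen to be available. The helper name `nd_characters` is made up for this example.

```python
import sys
import unicodedata


def nd_characters(db):
    """Return the set of code points that `db` classifies as Nd (decimal digit)."""
    return {
        cp for cp in range(sys.maxunicode + 1)
        if db.category(chr(cp)) == "Nd"
    }


old_nd = nd_characters(unicodedata.ucd_3_2_0)  # frozen Unicode 3.2.0 tables
new_nd = nd_characters(unicodedata)            # tables bundled with this Python

print("Unicode", unicodedata.ucd_3_2_0.unidata_version, "Nd count:", len(old_nd))
print("Unicode", unicodedata.unidata_version, "Nd count:", len(new_nd))
print("In the newer Nd set but not the older one:", len(new_nd - old_nd))
```

The exact counts depend on which Python you run, since each release bundles its own Unicode version.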

The XML specification says that XML documents can use the characters in the Nd category.

But, but, but, ...

The characters in the Nd category may vary, depending on the version of Unicode. Do we have an ever-changing base of characters that are permitted in XML documents?

The XML specification mandates version 2.0 of Unicode.

Phew! That removes the variability in the set of characters that may be used in XML documents.

But wait! 

There are technologies built on top of XML, and some of them are lax about which version of Unicode must be used. For example, with XML Schema:

    As far as conformant processors are concerned, the spec offers
    implementers freedom to choose which version of Unicode they will
    support. So if the definitions of character groups like Nd change from
    one Unicode version to the next, this may be reflected in differences
    between schema processors. [1]

So, one XML Schema validator may support Unicode 2.0 and another may support Unicode 2.1. Suppose, hypothetically, that the Nd category has 600 characters in Unicode 2.0 and 610 characters in Unicode 2.1. An XML instance document that uses one of the 10 added characters where a decimal digit is required may validate with one validator and fail with the other.
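Here is a minimal sketch of that disagreement. It is not a real XML Schema validator: the function `all_decimal_digits` is a made-up stand-in for a constraint such as "every character must be in Nd", and the two Unicode versions are the ones Python's `unicodedata` module happens to ship (the frozen 3.2.0 tables and the interpreter's current tables), not 2.0 and 2.1.

```python
import unicodedata


def all_decimal_digits(text, db):
    """True if `text` is non-empty and every character is Nd according to `db`."""
    return bool(text) and all(db.category(ch) == "Nd" for ch in text)


# NKO DIGIT ONE and NKO DIGIT TWO, which were added to Unicode after 3.2.0.
value = "\u07C1\u07C2"

older_validator = all_decimal_digits(value, unicodedata.ucd_3_2_0)
newer_validator = all_decimal_digits(value, unicodedata)

print("older tables accept the value:", older_validator)
print("newer tables accept the value:", newer_validator)
```

The same value is rejected under the older tables and accepted under the newer ones, which is the kind of difference between schema processors that the quoted message describes.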

Ouch!

Are other XML applications similarly lax, permitting implementers to pick which version of Unicode they will support?

Does the XSLT spec allow implementers freedom to choose which version of Unicode they will support?

Does the UBL spec allow implementers freedom to choose which version of Unicode they will support?

Does the RELAX NG spec allow implementers freedom to choose which version of Unicode they will support?

Does the XBRL spec allow implementers freedom to choose which version of Unicode they will support?

Does the SVG spec allow implementers freedom to choose which version of Unicode they will support?

/Roger

[1] http://lists.w3.org/Archives/Public/xmlschema-dev/2011May/0024.html 



