
RE: Your XML documents may use different sets of characters, de

  • From: Liam R E Quin <liam@w3.org>
  • To: "Costello, Roger L." <costello@mitre.org>
  • Date: Tue, 17 May 2011 17:07:25 +0200

On Tue, 2011-05-17 at 10:49 -0400, Costello, Roger L. wrote:
[...]
> The following statements apply to "data" not to "markup" (i.e.,
> element names, attribute names).
> 
> 1. Except for unpaired surrogate codepoints and a few control
> characters, you can use any character you want in XML documents.

In particular, codepoint 0 is not allowed.
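
As a rough illustration of which codepoints are allowed, here is a minimal sketch that encodes the Char production of XML 1.0 (XML 1.1 has a different rule and is not covered). The function names are made up for this example.

```python
def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # excludes the other C0 controls, incl. 0
        or 0xE000 <= cp <= 0xFFFD      # skips surrogates, U+FFFE and U+FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


def first_disallowed(text: str):
    """Return (index, code point) of the first disallowed character, or None."""
    for index, ch in enumerate(text):
        if not is_xml10_char(ord(ch)):
            return index, ord(ch)
    return None
```

Note that the check looks only at the codepoint value; it does not ask whether Unicode has assigned a character to it.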

> 2. The characters don't have to be defined in the Unicode
> specification.

The codepoints do not have to have Unicode characters associated with
them.

> 
> 3. For characters that don't have a visual representation or aren't in
> the Unicode character set, you can use them via XML's character
> entity mechanism, e.g., &#xffed;

You can do that with any allowed character, and you can also include the
character directly.
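
A small sketch of that equivalence, using Python's standard xml.etree.ElementTree parser as a stand-in for "an XML processor" (the element name `a` and the variable names are arbitrary). It also shows that a numeric reference does not get around the codepoint 0 restriction.

```python
import xml.etree.ElementTree as ET

# The same character, once as a numeric reference and once written directly.
via_reference = ET.fromstring("<a>&#xFFED;</a>").text
direct = ET.fromstring("<a>\uFFED</a>").text
same = via_reference == direct

# A reference to a disallowed codepoint is still an error.
try:
    ET.fromstring("<a>&#0;</a>")
    nul_rejected = False
except ET.ParseError:
    nul_rejected = True
```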

> 
> 4. Implementers of XML applications are free to choose which version
> of Unicode they will support. Thus, one implementer of an XML Schema
> validator may choose to support Unicode 2.0, while another implementer
> of an XML Schema validator may choose to support Unicode 2.1. One
> implementer of an XSLT processor may choose to support Unicode 2.0,
> while another implementer of an XSLT processor may choose to support
> Unicode 2.1.

Or the version of Unicode understood may depend on the operating
environment, e.g. on the Java VM in use.

>
> 5. In XML applications that use regular expressions (e.g. XML Schema,
> XSLT), be careful about using regexes that contain regex categories
> such as Nd. The characters in those regex categories may vary
> depending on which version of Unicode an implementer supports. Thus,
> your application may execute without errors with one vendor's tool and
> fail on another.

That may be what you want, it turns out.  "When our system is upgraded
our schema is ready for it"...
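
To see the environment dependence concretely, here is a minimal sketch in Python rather than XML Schema or XSLT. The standard unicodedata module reports which Unicode version the running interpreter was built with, and membership in a category such as Nd is answered from that same database. The helper name is made up for this example.

```python
import unicodedata

# Version of the Unicode Character Database bundled with this interpreter.
# It changes when the runtime is upgraded, not when your code changes.
unicode_version = unicodedata.unidata_version


def is_decimal_digit(ch: str) -> bool:
    """Return True if ch is in general category Nd in this runtime's database."""
    return unicodedata.category(ch) == "Nd"


# ASCII digit, ARABIC-INDIC DIGIT THREE, and a letter.
samples = {ch: is_decimal_digit(ch) for ch in ("7", "\u0663", "x")}
```

A character added to Nd in a later Unicode version is reported as unassigned (category "Cn") by an older runtime, so the same test can give different answers on different installations.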

> 6. CREPDL is a technology that allows you to precisely define the
> universe of characters that you want to allow in your XML documents.

You can also do this with an XSD facet.
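
The facet in question would typically be a pattern (xs:pattern) on a simple type. The following is only an emulation in Python, not an XSD validator: the allowed repertoire is a hypothetical choice, and Python's `re` syntax differs from the XSD regex dialect. An XSD pattern must match the whole value, which is why `fullmatch` is used.

```python
import re

# Hypothetical repertoire: printable ASCII plus U+00C0 to U+00FF.
# Explicit codepoint ranges do not shift with the Unicode version,
# unlike category escapes such as Nd.
ALLOWED = re.compile(r"[\u0020-\u007E\u00C0-\u00FF]*")


def conforms(value: str) -> bool:
    """Return True if every character of value is in the allowed repertoire."""
    return ALLOWED.fullmatch(value) is not None
```

For example, `conforms("Caf\u00e9")` accepts the precomposed é, while `conforms("Cafe\u0301")` rejects the combining accent U+0301 because it lies outside the listed ranges.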

Liam


-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://www.fromoldbooks.org/
Occasional blog: http://www.barefootliam.org/
The barefoot typographer




