
Re: Some comments on the 1.1 draft

From: "Michael Rys" <mrys@m...>
> Well, that may have been the original XML 1.0 use, but looking at where
> XML is currently having the most traction (SOAP, Messaging, WebDav,
> database serialization etc), this has changed.
One big advantage of disallowing control characters from XML documents
and silly characters from XML names is that it catches most common encoding errors.

For example, there is the very common problem of data labelled ISO 8859-1 that
contains a 0x80 byte (the Windows-1252 code for the Euro character).
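
As a rough illustration of that kind of check (a sketch only, not what any particular XML parser does): bytes in the 0x80-0x9F range decode to C1 control characters under ISO 8859-1, so their presence in data carrying that label is a strong hint that the data is really Windows-1252 or similar. The function name `find_c1_controls` and its parameters are hypothetical, chosen for this example.

```python
def find_c1_controls(raw: bytes, declared_encoding: str = "iso-8859-1"):
    """Return (index, code point) pairs for C1 control characters.

    Assumes a single-byte encoding, so the character index is also
    the byte offset into the original data.
    """
    text = raw.decode(declared_encoding)
    return [(i, ord(ch)) for i, ch in enumerate(text) if 0x80 <= ord(ch) <= 0x9F]


# A Euro sign followed by "5", written out as Windows-1252 but
# then (wrongly) treated as ISO 8859-1.
suspect = "\u20ac5".encode("cp1252")
for index, code_point in find_c1_controls(suspect):
    print(f"suspicious control character U+{code_point:04X} at byte {index}")
```

Genuine ISO 8859-1 text, such as an accented Western European word, produces no hits, so the check flags the mislabelled data without rejecting correctly labelled data.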

At the moment XML provides the only disciplined point in the processing chain:
when data is in XML one *must* have the encoding correct.  This may
cause some consternation to us programmers, who perhaps have lived in a fool's 
paradise where encoding does not matter, but it is a fundamental point
of Quality Control for XML documents and exposes data corruption at the point
where it can be corrected.

To allow control characters would make us sink back into the horrible mess
of which everyone familiar with working in multi-character-set environments without
XML is well aware (or, at least, becomes well aware when everything comes
crashing down).

Most DBMSs do not perform any checking of encoding. So you
can store almost anything in, say, a DBMS expecting ISO 8859-1.  With
a world full of data incorrectly labelled, there is no chance of good 
interoperability without some basic checking. And those basic checks
are what XML's data character and naming rules provide. 
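
A minimal sketch of such a basic check, assuming the `Char` production of XML 1.0 as the rule for document characters (it does not cover the naming rules, and the helper name `illegal_xml10_chars` is hypothetical):

```python
import re

# Complement of the XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_NOT_XML10_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def illegal_xml10_chars(text: str):
    """Return (index, code point) pairs for characters outside Char."""
    return [(m.start(), ord(m.group())) for m in _NOT_XML10_CHAR.finditer(text)]


# A NUL and a unit separator slipped into otherwise ordinary text.
print(illegal_xml10_chars("ok\x00bad\x1f"))
```

Tab, line feed and carriage return pass, while the other C0 control characters are reported. Note that this production on its own does not exclude the 0x80-0x9F range, so a sender who wants to catch mislabelled Windows-1252 data would need a separate check for that range.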

Without them, sure, XML would be "simpler" and we could attempt to transmit
arbitrary strings around. But then encoding detection or repair would be
the problem of the recipient and not the sender: a responsible recipient 
can have no faith that their non-ASCII data has not been corrupted.

And that lies at the heart of the matter: if we allow control characters
and silly name characters, we won't actually increase the number of
characters that can be reliably sent: we will just make non-ASCII
characters suspect and unreliable.  

Rick Jelliffe

