
RE: Some comments on the 1.1 draft

  • To: "Rick Jelliffe" <ricko@a...>,<xml-dev@l...>
  • Subject: RE: Some comments on the 1.1 draft
  • From: "Michael Rys" <mrys@m...>
  • Date: Wed, 19 Dec 2001 10:14:29 -0800
  • Thread-index: AcGIWlNTdhMt326BQC6wJfLeWz+qpQAXCfZw
  • Thread-topic: Some comments on the 1.1 draft

Rick, I don't have a strong opinion on the name encoding (since our
products and SQLX already use an encoding that is a valid 1.0 name). 

I don't understand your encoding issues though. I am mainly talking
about the Unicode code points. If somebody uses an encoding where U+0085
is not a valid character, then that should be an error. If it is a valid
character but not the intended Unicode character, then it is an error
that a parser may not be able to detect (we certainly can get this even
in XML 1.0).
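
To make the two cases concrete, here is a minimal Python sketch. The
sample byte and the helper name decode_report are illustrative
assumptions, not anything from the 1.1 draft. It decodes the single
byte 0x85 under three encoding labels: as UTF-8 the byte is invalid and
decoding fails, as ISO 8859-1 it silently becomes the control character
U+0085, and as Windows-1252 it becomes U+2026 (horizontal ellipsis).
Only the first case is something a parser can detect on its own.

```python
# Hypothetical sample: one raw byte whose meaning depends on the encoding label.
RAW = b"\x85"


def decode_report(raw: bytes, encoding: str) -> str:
    """Return the decoded code points, or "error" if raw is invalid in that encoding."""
    try:
        text = raw.decode(encoding)
    except UnicodeDecodeError:
        return "error"
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


for enc in ("utf-8", "iso-8859-1", "cp1252"):
    print(f"{enc:10s} -> {decode_report(RAW, enc)}")
```

The last two labels both decode without complaint, so choosing between
them needs knowledge of what the sender intended.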

I can assure you that the database community has even more encoding
support than many XML processors (look up collations). 

Best regards
Michael

> -----Original Message-----
> From: Rick Jelliffe [mailto:ricko@a...]
> Sent: Tuesday, December 18, 2001 23:03 PM
> To: xml-dev@l...
> Subject: Re:  Some comments on the 1.1 draft
> 
> From: "Michael Rys" <mrys@m...>
> 
> > Well, that may have been the original XML 1.0 use, but looking at
> > where XML is currently having the most traction (SOAP, Messaging,
> > WebDav, database serialization etc), this has changed.
> 
> One big advantage of disallowing control characters from XML documents
> and silly characters from XML names is that it catches most common
> encoding errors.
> 
> For example, the very common problem of data labelled ISO 8859-1
> containing a 0x85 byte (for the Euro character).
> 
> At the moment XML provides the only disciplined point in the processing
> chain: when data is in XML one *must* have the encoding correct.  This
> may cause some consternation to us programmers, who perhaps have lived
> in a fool's paradise where encoding does not matter, but it is a
> fundamental point of Quality Control for XML documents and exposes data
> corruption at the point where it can be corrected.
> 
> To allow control characters would make us sink back into the horrible
> mess that everyone familiar with working in multi-character set
> environments without XML is well aware of (or, at least, becomes well
> aware of when everything comes crashing down).
> 
> Most DBMS systems do not perform any checking of encoding. So you
> can store almost anything in, say, a DBMS expecting ISO 8859-1.  With
> a world full of data incorrectly labelled, there is no chance of good
> interoperability without some basic checking. And those basic checks
> are what XML's data character and naming rules provide.
> 
> Without them, sure XML would be "simpler" and we could attempt to
> transmit arbitrary strings around. But then encoding detection or
> repair would be the problem of the recipient and not the sender: a
> responsible recipient can have no faith that their non-ASCII data has
> not been corrupted.
> 
> And that lies at the heart of the matter: if we allow control
> characters and silly name characters, we won't actually increase the
> number of characters that can be reliably sent: we will just make
> non-ASCII characters suspect and unreliable.
> 
> Cheers
> Rick Jelliffe
> 
> 
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
> 
> The list archives are at http://lists.xml.org/archives/xml-dev/
> 
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://lists.xml.org/ob/adm.pl>
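
As a rough illustration of the basic checking discussed in the quoted
message, the following Python sketch tests code points against the
XML 1.0 Char production (#x9 | #xA | #xD | [#x20-#xD7FF] |
[#xE000-#xFFFD] | [#x10000-#x10FFFF]). The function names and the
sample string are hypothetical. The separate C1 check (U+0080 to
U+009F) is only a heuristic assumed here: those characters are legal
in XML 1.0, but in data labelled ISO 8859-1 they often indicate bytes
that were really Windows-1252.

```python
def is_xml10_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def check_text(text: str) -> tuple[list[int], list[int]]:
    """Return (illegal, suspect) index lists for text.

    illegal: positions of characters outside the XML 1.0 Char production.
    suspect: positions of C1 controls, legal in XML 1.0 but a common
             sign of a wrong encoding label (heuristic, an assumption).
    """
    illegal = [i for i, ch in enumerate(text) if not is_xml10_char(ord(ch))]
    suspect = [i for i, ch in enumerate(text) if 0x80 <= ord(ch) <= 0x9F]
    return illegal, suspect


# Hypothetical sample: a C0 control at index 2 and U+0085 at index 7.
illegal, suspect = check_text("ok\x01 caf\x85")
print("illegal:", illegal)
print("suspect:", suspect)
```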

