
Re: W3C XML Core WG requests comment: control characters in XM


control characters in xml
John Cowan wrote

> This is a request for comment from this mailing list (or anyone else)
> on a proposal by Shigemichi Yazawa for a standard representation for
> the Unicode control characters that are not legal in XML 1.0.  See
>
> http://lists.w3.org/Archives/Public/www-xml-blueberry-comments/2002May/0000.html
>
> In essence, this provides an element "<xml:orphanedChar value="#x0001">"
> which can be used *by convention* in place of an actual (and illegal) #x1
> character.  The Infoset would view this as an element, not a character;

I'm not too keen on this proposal, even though it does have some merit.  The
idea here is to represent a character using an entirely different infoset
item.  This hack enables the application to bypass the XML character rules
but it is nevertheless a hack.  I see no reason why this should be adopted
as part of the XML recommendation - if individual applications wish to
obfuscate control characters in this way there is nothing stopping them from
doing so in XML 1.0.
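
As a rough illustration of that point, the following Python sketch shows how an application could adopt such a convention on its own under XML 1.0. The element name `orphanedChar` (unprefixed here) and the helpers `encode_text` and `decode_text` are hypothetical names chosen for this example; they are not part of the proposal or of any library.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder element, modelled on the proposal's xml:orphanedChar.
PLACEHOLDER = "orphanedChar"


def is_legal_xml10(ch):
    """Return True if ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def encode_text(tag, text):
    """Build <tag> holding text, with illegal characters as placeholder elements."""
    root = ET.Element(tag)
    root.text = ""
    last = None  # most recent placeholder; later text goes into its tail
    for ch in text:
        if is_legal_xml10(ch):
            if last is None:
                root.text += ch
            else:
                last.tail = (last.tail or "") + ch
        else:
            last = ET.SubElement(root, PLACEHOLDER, value="#x%04X" % ord(ch))
    return root


def decode_text(root):
    """Reverse encode_text: turn placeholder elements back into characters."""
    parts = [root.text or ""]
    for child in root:
        if child.tag == PLACEHOLDER:
            parts.append(chr(int(child.get("value")[2:], 16)))
        parts.append(child.tail or "")
    return "".join(parts)


elem = encode_text("field", "a\x01b")
serialized = ET.tostring(elem, encoding="unicode")
roundtrip = decode_text(ET.fromstring(serialized))
```

Note that the placeholder only works in element content: an attribute value cannot contain child elements, which is the asymmetry raised in the next paragraph.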

In addition, I'm not happy with the way this proposal creates a distinction
between attribute values and element content.  Many people draw little
distinction between the two, with the obvious exception that attribute
values do not have structured content.

I think that a W3C recommendation should create a mechanism that is suitable
for both element content and attribute values.  Furthermore I would like to
see existing mechanisms used where possible.  Has the Core WG exhausted all
possibilities surrounding the idea of using character references (e.g. &#x05;)?
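
For context, an XML 1.0 parser rejects a character reference to a control character such as `&#x05;`, because a reference must resolve to a legal character. The small Python check below is a sketch using the standard library's expat-based parser, which implements XML 1.0; the helper name `parses` is made up for this example.

```python
import xml.etree.ElementTree as ET


def parses(doc):
    """Return True if doc is accepted by the XML 1.0 parser, False otherwise."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Tab (#x9) is a legal XML 1.0 character, so a reference to it is accepted.
ok_tab = parses("<a>&#x09;</a>")
# #x5 is not a legal XML 1.0 character, so the reference is not well-formed.
ok_ctrl = parses("<a>&#x05;</a>")
```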

I presume the idea has been dropped because of the need to protect existing
applications that cannot handle control characters.  But if you leave out
0x00 (which has well-known mischievous properties) then I think most
applications will be able to handle the other characters without problem.
Additionally, should the few XML 1.0 applications that would stumble on
control characters in text be permitted to block the progress of XML?
Surely, if a class of application exists which requires the new features of
XML 1.1, but cannot handle control characters, then it could be catered for
independently by parser vendors (perhaps by making a processing switch
available to treat control characters as not well-formed even in 1.1
entities).
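
One way such a switch might look, purely as a sketch: the function `check_text`, its `allow_controls` flag and the `ControlCharError` class are hypothetical, and for brevity the check covers only the C0 controls #x1-#x1F other than tab, LF and CR.

```python
import re

# C0 control characters other than tab (#x9), LF (#xA) and CR (#xD).
# NUL is left out: it is the one character excluded from the discussion above.
_CONTROLS = re.compile(r"[\x01-\x08\x0B\x0C\x0E-\x1F]")


class ControlCharError(ValueError):
    """Raised when control characters are switched off and one is found."""


def check_text(text, allow_controls=True):
    """Return text unchanged, or raise if controls are disallowed and present."""
    if not allow_controls:
        match = _CONTROLS.search(text)
        if match:
            raise ControlCharError(
                "control character #x%X at offset %d"
                % (ord(match.group()), match.start())
            )
    return text
```

An application that cannot cope with control characters would call `check_text(value, allow_controls=False)` on decoded character data, while other applications would leave the default in place.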

Please, let's not allow blatant hacks into the XML recommendations.

>
> An alternative proposal is to use a processing instruction such as
> "<?xmlchar #x1?>", which would be invisible to schemas.
No

Regards
~Rob

--
Rob Lugt
ElCel Technology
http://www.elcel.com/


