Re: W3C XML Core WG requests comment: control characters in XM
John Cowan wrote
> This is a request for comment from this mailing list (or anyone else)
> on a proposal by Shigemichi Yazawa for a standard representation for
> the Unicode control characters that are not legal in XML 1.0. See
> http://lists.w3.org/Archives/Public/www-xml-blueberry-comments/2002May/0000.html
>
> In essence, this provides an element "<xml:orphanedChar value="#x0001">"
> which can be used *by convention* in place of an actual (and illegal) #x1
> character. The Infoset would view this as an element, not a character;

I'm not too keen on this proposal, even though it does have some merit. The idea here is to represent a character using an entirely different infoset item. This hack enables the application to bypass the XML character rules, but it is nevertheless a hack. I see no reason why this should be adopted as part of the XML Recommendation: if individual applications wish to obfuscate control characters in this way, there is nothing stopping them from doing so in XML 1.0.

In addition, I'm not happy with the way this proposal creates a distinction between attribute values and element content. Many people draw little distinction between the two, with the obvious exception that attribute values do not have structured content. I think that a W3C recommendation should create a mechanism that is suitable for both element content and attribute values.

Furthermore, I would like to see existing mechanisms used where possible. Has the Core WG exhausted all possibilities surrounding the idea of using character references (e.g. &#x05;)? I presume the idea has been dropped because of the need to protect existing applications that cannot handle control characters. But if you leave out 0x00 (which has well-known mischievous properties) then I think most applications will be able to handle the other characters without problem. Additionally, should the few XML 1.0 applications that would stumble on control characters in text be permitted to block the progress of XML?
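
To make the character-reference point concrete, here is a minimal Python sketch of how an XML 1.0 parser treats these cases today. It assumes Python's standard xml.etree.ElementTree module (an expat-based XML 1.0 parser). The helper names is_xml10_char and parses_as_xml10 are illustrative only, and the unprefixed orphanedChar element is a hypothetical application-level stand-in for the proposed xml:orphanedChar, not something taken from the proposal itself.

```python
import xml.etree.ElementTree as ET


def is_xml10_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def parses_as_xml10(document: str) -> bool:
    """Return True if the XML 1.0 parser accepts the document as well-formed."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# A character reference to a control character is rejected in XML 1.0.
print(is_xml10_char(0x5), parses_as_xml10("<doc>&#x5;</doc>"))  # False False

# A reference to TAB, which is a legal Char, is accepted.
print(is_xml10_char(0x9), parses_as_xml10("<doc>&#x9;</doc>"))  # True True

# An application-defined element (hypothetical name, no xml: prefix) already
# parses in XML 1.0, because the parser sees an element, not a character.
print(parses_as_xml10('<doc><orphanedChar value="#x0001"/></doc>'))  # True
```

The last case is the sense in which nothing stops an application from adopting an element-based convention on its own, while the first case shows why character references to control characters would need a change to the well-formedness rules.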
Surely, if a class of application exists which requires the new features of XML 1.1 but cannot handle control characters, then it could be catered for independently by parser vendors (perhaps by making a processing switch available to treat control characters as not well-formed even in 1.1 entities).

Please, let's not allow blatant hacks into the XML recommendations.

> An alternative proposal is to use a processing instruction such as
> "<?xmlchar #x1?>", which would be invisible to schemas.

No.

Regards
~Rob

--
Rob Lugt
ElCel Technology
http://www.elcel.com/