Re: The XML 1.1 Candidate Recommendation is published
Hmm.

On Wed, 2002-10-16 at 06:56, Elliotte Rusty Harold wrote:
> C0 control characters such as form feed, vertical tab, BEL, and DC1
> through DC4 (whatever those are) are now allowed in XML text. However, they
> must be escaped as character references. They cannot be included literally in
> data. Nulls, thankfully, are still forbidden.

Why this is I don't understand. If you're allowing all sorts of control characters, forcibly encoded, what difference would it make to allow a null? Either the things stay safely encoded, in which case null is no different from the other controls, or they don't, in which case null is no different from the other controls.

> The C1 control characters such as BPH, IND, NBH, and PU1 are no longer
> allowed as literals in XML text. They too must now be escaped as character

I like this, in some ways. If controls are going to be allowed at all, then they should be handled *somehow*, and encoding seems to be the choice of the moment. I at least like the idea that C1 is to be treated with the same disdain that C0 gets.

> references. For the first time this means that some well-formed XML 1.0
> documents are not well-formed XML 1.1 documents. The exception, of course, is
> IBM's holy grail of NEL, which will be allowed in literal XML text, just to
> make life difficult for every text editor on the planet except those from IBM
> mainframes.

Here, I get confused. I went and looked at the 1.1 spec. There's a change to the discussion of line endings, which suggests that #xD #x85, #x85, and #x2028 get normalized to #xA, like #xD #xA or #xD followed by anything else. However, the production for S is not changed, so although these things participate in line endings, they aren't space characters. Is that correct? If the answer is "it doesn't matter, line end processing happens before checking for space," then the S production still ought to be changed (for clarity) to remove #xD, which is as can't-appear in that situation as any of the new bits.

But it makes more sense to me that anything considered to be part of a line ending ought to be listed in S, which would become: #x9 #xA #xD #x20 #x85 #x2028. I don't understand the inconsistency. But the whole thing seems to be nearly as weird as the Namespaces 1.1 rec, which seems to think that because the only way to have no namespace is to allow undeclaration of the default namespace, named prefixes ought to be undeclarable as well. Pure hobgoblin: foolish consistency.

Amy!
-- 
Amelia A. Lewis          amyzing@t...          alicorn@m...
The law, in its majestic equality, forbids the rich as well as the poor
to sleep under bridges, to beg in the streets, and to steal bread.
                -- Anatole France, "Le Lys Rouge"
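
The line-ending question raised in the message can be made concrete with a small sketch. It assumes the reading the message itself floats, namely that line-end normalization runs before any check against S, and it takes S to be the unchanged XML 1.0 set (#x20, #x9, #xD, #xA). The names `normalize_line_ends`, `is_s`, and `S_CHARS` are made up for illustration and are not taken from the spec or from any parser.

```python
import re

# Line-end sequences the message lists for XML 1.1. The two-character
# alternatives come first so "\r\n" and "\r\x85" are consumed as a pair
# instead of being split into a lone "\r" plus another character.
_LINE_END = re.compile("\r\n|\r\x85|\x85|\u2028|\r")

# Assumption: S as in XML 1.0, which the message says 1.1 leaves unchanged.
S_CHARS = {"\x20", "\x09", "\x0D", "\x0A"}


def normalize_line_ends(text: str) -> str:
    """Replace every recognized line-end sequence with a single #xA."""
    return _LINE_END.sub("\n", text)


def is_s(ch: str) -> bool:
    """True if the single character ch is a member of the S set."""
    return ch in S_CHARS


if __name__ == "__main__":
    sample = "a\r\nb\r\x85c\x85d\u2028e\rf"
    print(repr(normalize_line_ends(sample)))  # 'a\nb\nc\nd\ne\nf'

    # Before normalization, NEL and LINE SEPARATOR are not S characters...
    print(is_s("\x85"), is_s("\u2028"))  # False False
    # ...but once normalized they have become #xA, which is.
    print(is_s(normalize_line_ends("\x85")))  # True
```

Under that ordering, a literal #x85, #x2028, or #xD never reaches the S check unchanged, which is the sense in which #xD in S is as "can't-appear" as the new characters.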