Re: The XML 1.1 Candidate Recommendation is published


Hmm.

On Wed, 2002-10-16 at 06:56, Elliotte Rusty Harold wrote:
> C0 control characters such as form feed, vertical tab, BEL, and DC1
> through DC4 (whatever those are) are now allowed in XML text. However, they
> must be escaped as character references. They cannot be included literally in
> data. Nulls, thankfully, are still forbidden.

Why this is, I don't understand.  If you're allowing all sorts of control
characters, provided they are encoded, what difference would it make to
allow a null?  Either the things stay safely encoded, in which case null
is no different from the other controls, or they don't, in which case
null is no different from the other controls.
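
To make the rule in the quoted summary concrete, here is a rough sketch of a serializer step that writes controls as character references. It follows only what the quote says (C0 and C1 escaped, NEL left literal, null rejected), not the exact productions in the spec; the function name escape_controls is made up for illustration.

```python
def escape_controls(text: str) -> str:
    """Escape C0/C1 controls as character references, per the quoted summary.

    Assumptions (from the quoted message, not checked against the spec's
    productions): tab, LF and CR stay literal, NEL (#x85) stays literal,
    and U+0000 is rejected outright.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp == 0:
            raise ValueError("U+0000 is forbidden")
        is_c0 = cp < 0x20 and ch not in "\t\n\r"
        is_c1 = 0x80 <= cp <= 0x9F and cp != 0x85
        if is_c0 or is_c1:
            out.append(f"&#x{cp:X};")  # hexadecimal character reference
        else:
            out.append(ch)
    return "".join(out)
```

Note that null takes the one branch that differs from every other control, which is exactly the special case being questioned here.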

> The C1 control characters such as BPH, IND, NBH, and PU1 are no longer
> allowed as literals in XML text. They too must now be escaped as character

I like this, in some ways.  If controls are going to be allowed at all,
then they should be handled *somehow*, and encoding seems to be the
choice of the moment.  I at least like the idea that C1 is to be treated
with the same disdain that C0 gets.

> references. For the first time this means that some well-formed XML 1.0
> documents are not well-formed XML 1.1 documents. The exception, of course, is
> IBM's holy grail of NEL, which will be allowed in literal XML text, just to
> make life difficult for every text editor on the planet except those from IBM
> mainframes.

Here, I get confused.  I went and looked at the 1.1 spec.  There's a
change to the discussion of line endings, which suggests that the
sequence #xD #x85 and the single characters #x85 and #x2028 each get
normalized to #xA, just like #xD #xA or a #xD followed by anything else.
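
As I read it, the line-end handling amounts to something like the sketch below. This is my reading of the draft, not an implementation taken from it, and normalize_line_ends is a made-up name.

```python
import re

# Sequences treated as a line end, as described in this message:
# #xD #xA, #xD #x85, a lone #x85, a lone #x2028, and any other #xD.
# The two-character sequences come first so they collapse to one #xA.
_LINE_END = re.compile("\r\n|\r\x85|\x85|\u2028|\r")


def normalize_line_ends(text: str) -> str:
    """Translate every recognized line-end sequence to a single #xA."""
    return _LINE_END.sub("\n", text)
```

For example, normalize_line_ends("a\r\x85b") gives "a\nb": the pair becomes one newline, not two.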

However, the production for S is not changed, so although these things
participate in line endings, they aren't space characters.  Is that
correct?

If the answer is "it doesn't matter, line end processing happens before
checking for space," then the S production still ought to be changed
(for clarity) to remove #xD, which can no more appear at that point than
any of the new characters can.  But it makes more sense to me that
anything considered to be part of a line ending ought to be listed in S,
which would become: #x9 #xA #xD #x20 #x85 #x2028.  I don't understand
the inconsistency.
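
The mismatch is easy to show with two sets. This is only an illustration: the set contents come from the S production and the line-ending text as described in this message, and the names are invented.

```python
# Characters matched by the (unchanged) S production: #x9 #xA #xD #x20.
S_CHARS = {"\t", "\n", "\r", " "}

# Characters that take part in line endings under the 1.1 draft.
LINE_END_CHARS = {"\n", "\r", "\x85", "\u2028"}


def is_s(text: str) -> bool:
    """True if text is one or more characters from the S production."""
    return len(text) > 0 and all(ch in S_CHARS for ch in text)


# Line-end characters that S does not cover: #x85 and #x2028.
not_in_s = sorted(LINE_END_CHARS - S_CHARS)

# The S production suggested above: the union of both sets.
S_PROPOSED = S_CHARS | LINE_END_CHARS
```

So is_s("\r") holds while is_s("\x85") does not, even though both characters are treated as line ends.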

But the whole thing seems to be nearly as weird as the Namespaces 1.1
rec, which seems to think that because the only way to have no namespace
is to allow undeclaration of the default namespace, named prefixes
ought to be undeclarable as well.  Pure hobgoblin: foolish consistency.

Amy!
-- 
Amelia A. Lewis       amyzing@t...      alicorn@m...
The law, in its majestic equality, forbids the rich as well as the poor
to sleep under bridges, to beg in the streets, and to steal bread.
                -- Anatole France, "Le Lys Rouge"
