Re: FYI: Announcement of a new I-D for XML media types

  • From: Murata Makoto <mura034@a...>
  • To: xml-dev@x...
  • Date: Wed, 10 May 2000 13:01:29 +0900

Rick,

Thank you for your comments.

Some of your comments (XML Base, UTF-16LE/BE) should actually go to
the XML Core WG of W3C and the ietf-charset mailing list.

> 1) Charset "should" be given for application/xml. HTTP has a character 
> set handling concept that comes from fantasyland. I would recommend a 
> very different policy: never use xml/data, always use application/xml; 
> never use charset, always use xml encoding declarations. 

RFC 2376 already has this attribute.  We cannot remove it.  

You do not have to use this attribute.  Then, you will get 
what you want.

I am willing to add arguments for and against this attribute to the
next I-D.  

> 2) "When non-validating processors handle XML documents, they do not 
>       always read external parsed entities. Thus, interoperability is 
>       not guaranteed." 
> This is just FUD: why isn't this handled by the "standalone" declaration? 
> If it is a comment about bugs in software, that is out-of-place here. 

If two processors emit different element structures, we do not have
interoperability.  Thus, XML 1.0 does not provide full
interoperability unfortunately.  (I keep kicking myself for not
voting against external parsed entities.)

standalone="no" provides a helpful warning.  But it does not provide
interoperability.  It merely makes the problem explicit.

> 3) Support for xml:base.

If XML Base becomes a W3C recommendation, our upcoming RFC must
mention XML Base.  If it does not, the mention must be omitted.  The
I-D mentions it now, since XML Base is in the last call phase.  You
might want to send your comments to www-xml-linking-comments@w....

I am personally not a big fan of xml:base.  It may be useful for
XLink&XPointer.  But XML Base is intended to affect *any* attribute
if application programs use it as a relative URI reference.  I do not
think all application programmers will agree to implement XML Base.
But I have had little support.

> 4) The rocket scientists at IETF have managed a new thing with the spec 
> for utf16be (if you use utf16be you cannot have a BOM apparently): it 

This issue has been extensively discussed at Unicode Consortium 
as well as the ietf-charset mailing list.  We already have an 
RFC (RFC 2781).  

I believe that the consensus at W3C is to allow UTF-16LE/BE 
as optional charsets for XML (encoding declarations are required, 
though).  Again, you might want to argue at W3C (I18N IG and XML Core WG).
I am personally not a fan of UTF-16LE/BE either.
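
To make the BOM point concrete, here is a minimal sketch using Python's
standard codecs.  The sample document string is hypothetical, and the
behaviour shown is that of Python's "utf-16-be" and "utf-16" codecs,
which I am using only as an illustration of the labelled (no BOM)
versus unlabelled (BOM) distinction, not as a statement of what any
XML processor does.

```python
import codecs

# Hypothetical sample document; the encoding declaration is present
# because the byte order is fixed by the label and no BOM is written.
text = '<?xml version="1.0" encoding="UTF-16BE"?><a/>'

# "utf-16-be" names the byte order explicitly, so no BOM is emitted.
be_bytes = text.encode("utf-16-be")

# Plain "utf-16" writes a BOM first, then uses the native byte order.
bom_bytes = text.encode("utf-16")


def starts_with_bom(data: bytes) -> bool:
    """Return True if data begins with a UTF-16 byte order mark."""
    return data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE))


print(starts_with_bom(be_bytes))   # False
print(starts_with_bom(bom_bytes))  # True
```

Without a BOM, a recipient has only the charset label or the encoding
declaration to go on, which is why the declaration matters here.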

> 5) Along similar lines, but far worse and of major importance for 
> internationalization, the fragment identifier of a URI has to be in 
> US-ASCII with %HH escaping. Here I am in Taipei and I want to include 
> an Xpointer to refer to an ID or element name or attribute name or 
> value, and I have to first find the numeric values of my Big5, then 
> transcode it into Unicode, then find out what the Unicode values are in 
> HEX, then put them in. Is that the way it is supposed to work? 

No, it is not the way it is supposed to work.  Believe me, I would
never agree to such a horrible scenario!  (But I am willing to expand the
I-D to address your concern.)

You write a "URI reference" in Big5.  This is fine.  Before 
sending the "URI reference" to protocols such as HTTP, programs 
have to convert it to the %HH format.  In other words, %HH 
exists on the wire only.  When interpreting such URIs containing 
%HH, recipient programs have to reconstruct Unicode characters.

HTML 4.01 already recommends that user agents convert non-ASCII 
characters to %HH.  See
http://www.w3.org/TR/html401/appendix/notes.html#h-B.2.1
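
The conversion described above can be sketched roughly as follows.
This assumes the HTML 4.01 convention (each character is represented
in UTF-8 and the resulting bytes are escaped as %HH); the fragment
value and the two helper function names are hypothetical and exist
only for this illustration.

```python
from urllib.parse import quote, unquote


def fragment_to_wire(fragment: str) -> str:
    """Escape a fragment identifier as UTF-8 bytes in %HH form."""
    return quote(fragment, safe="", encoding="utf-8")


def fragment_from_wire(escaped: str) -> str:
    """Reconstruct the Unicode characters from the %HH form."""
    return unquote(escaped, encoding="utf-8")


# Hypothetical ID as the author types it; the document is stored in Big5.
authored = "台北"
big5_bytes = authored.encode("big5")

# The program, not the author, decodes Big5 and produces the %HH form.
wire = fragment_to_wire(big5_bytes.decode("big5"))
print(wire)                      # %E5%8F%B0%E5%8C%97
print(fragment_from_wire(wire))  # 台北
```

The author only ever sees and types the Big5 text; the %HH form is
produced and consumed by software.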

URI is defined by RFC 2396, and it allows only a limited subset of
the ASCII repertoire.  (Since a URI is a protocol 
element, I think that this is acceptable.)  A new spec for IURI is 
being developed.  It is available at:

http://search.ietf.org/internet-drafts/draft-masinter-url-i18n-05.txt

> Good to see: 
> 
> 
> 1) |xml suffix is great idea 
> 
> 
> 2) MIME types for DTDs and external parsed entities 

Thanks.

Cheers,

Makoto
 
Internet: mura034@a...
Nifty: VEQ00625
