
Re: XML Blueberry (non-ASCII name characters in Japan)

  • From: Elliotte Rusty Harold <elharo@m...>
  • To: xml-dev@l...
  • Date: Mon, 09 Jul 2001 17:28:45 -0400

At 5:06 PM -0400 7/9/01, Simon "St.Laurent" wrote:
>On 09 Jul 2001 16:07:24 -0400, Elliotte Rusty Harold wrote:
>> At 2:52 PM -0400 7/9/01, John Cowan wrote:
>> 
>> >Please tell me what kind of argument you would find convincing.
>> >
>> 
>> Prove to me the existence of 10,000 or more users who want to write XML
>> *markup* in any combination of the scripts added in Unicode 3.0 and
>> 3.1, who cannot reasonably use an alternative script for their 
>> language of choice, and who do not read and write some better 
>> supported language. 
>
>Are those the criteria for Unicode?  

Of course not, and those shouldn't be the criteria for inclusion in Unicode, because Unicode has totally different needs than XML markup. Unicode is about *TEXT*. We're talking about *MARKUP*. These points keep getting confused. To lose any of these scripts from text would be a huge disadvantage. It would clearly disenfranchise far more than 10,000 users apiece. It would be a radical impoverishment of human culture. But nobody's arguing that.

One more time: every single one of the characters in question can be used in XML documents today. Want to publish a newspaper in Amharic using XML? No problem. Want to write poetry in Burmese? Not an issue. Want to take down oral history in Khmer? Go for it. 
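To make the text/markup distinction concrete, here is a minimal sketch in Python (not something from the original thread). It assumes a hypothetical one-element document whose markup stays in ASCII while its character data is Ethiopic script, one of the scripts added in Unicode 3.0. The element name "headline" and the sample text are illustrative assumptions only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the element name is plain ASCII,
# while the character data is three Ethiopic code points
# (U+1230, U+120B, U+121D). Both are illustrative assumptions.
doc = "<headline>\u1230\u120b\u121d</headline>"

root = ET.fromstring(doc)

# The markup (element name) stays within ASCII ...
print(root.tag)

# ... and the Ethiopic text comes back from the parser unchanged.
print(root.text == "\u1230\u120b\u121d")
```

The sketch deliberately does not try the same characters in an element name. Whether a parser accepts that depends on which name rules it implements, which is exactly the point under dispute.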

If XML 1.0 prohibited these characters from #PCDATA, then there'd be a much stronger argument for breaking compatibility. Indeed, I probably would have suggested it myself quite a while ago, but it wasn't necessary then and it isn't necessary now. I remember when I first realized what the XML 1.0 BNF grammar did with the unassigned characters, and why it did that. Frankly, I was shocked. I never would have thought of doing that. Fortunately, the team that put together XML 1.0 did a much better job than I could have done, and a much better job than you're giving them credit for now. XML 1.0 is fully adequate for any form of text in any of these languages, as well as a number yet to come. That's truly an amazing achievement.

(FYI I have been advocating breaking backwards compatibility in Java for several months now over this issue because unlike XML, Java cannot use these characters in plain text.) 

>I don't think so, and I'm not
>really sure what cloud you picked 10,000 out of.  On the basis of 10,000
>people using markup, we can probably disenfranchise significant
>communities around the globe who use characters already recognized by
>XML 1.0.
>

The criterion isn't really 10,000. The criterion is enough users to justify the cost of transition. So far I've yet to see the existence of one such user demonstrated, much less 10,000. 

Again, if there were no additional cost to adding these characters, we wouldn't be having this conversation. But there is a cost, a real one, and one that's going to affect many people. For some people it will be a minor inconvenience. A few book authors like you and me may actually make money off this change. But a lot of non-experts are going to get hammered with unexpected incompatibilities they can't easily diagnose. 

>I'm deeply unconvinced that it's the job of the XML community to decide
>issues which seem far more likely to be understood by the Unicode
>community.

It's not the W3C's job to decide what characters should be allowed in text. It's not the Unicode Consortium's job to decide what characters should be allowed in XML names.

-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@m... | Writer/Programmer |
+-----------------------+------------------------+-------------------+ 
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|              http://www.ibiblio.org/xml/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      | 
|  Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/     |
+----------------------------------+---------------------------------+
