Re: Processing XML 1.1 documents with XML Schema 1.0 processor
Eric van der Vlist wrote:
> On Fri, 2005-05-13 at 11:52 +0100, Michael Kay wrote:
>> With all these things, I think one has to ask what is the approach that
>> causes the least amount of pain to the average user. Asking everyone to
>> change a namespace URI so that a few users can identify clearly whether or
>> not their patterns are intended to match Ethiopian letters isn't a net win.
>
> Only those whose patterns are intended to match Ethiopian letters would
> have to change the namespace URIs and that should reduce the number of
> such users by several orders of magnitude!

I beg to differ, Eric. When I use a string or a sequence of name characters I want it to be just a damn string, and the last thing I want to think about is whether it will be usable in Ethiopian, Myanmar, Khmer, or Mongolian. I don't want the users of my specification/schema/tool to have to figure out for themselves (or to ask me) whether they can use the Katakana middle dot in Japanese element names or not.

A string, a name character, a white space character within an electronic document MUST be recognized as such according to the current state of the art. It MUST be able to be whatever the latest version of Unicode says it is. Of all people, *we* should know that the encoding of text on a global scale is not a static science: it evolves, and needs to evolve, as Unicode improves.

Yes, this implies a phase during which XML processors may lose some interoperability, but whoever puts XML interoperability above human language operability needs to have their priorities seriously revised. Yes, this may break software that is making stupid assumptions about the content of certain tokens, but such software was written based on a misunderstanding of text and deserves to break (and then to be shot in the kneecaps, tied to a horse and dragged all around town, dipped in boiling lead, dismembered piece by piece with a rusty spoon, and finally dumped in a ditch to agonize).

XML is about text, dammit, and text is meant to encode something very much alive called languages. It will change and it will move, under the effect of both language evolution and the progress made by the Unicode Consortium in encoding more and more of it, a task of gargantuan proportions comparable to the attempts at mathesis that everyone had given up on. Anyone expecting it to be different is still living in a legacy US-ASCII world that just happens to have a larger set of characters.

How can XML be the universal data format without the ability to handle universal text? Heck, it's SGML for the *WORLD WIDE* Web we're talking about, not a falsely ubiquitous data interchange format for big American companies.

--
Robin Berjon
Research Scientist
Expway, http://expway.com/