Re: UTF-8+names
--- "Simon St.Laurent" <simonstl@s...> wrote:

> I don't know about this. I have to say that it feel fundamentally
> corrupt to me, effectively cheating on the separation between character
> encodings and the representation of characters in those encodings.

No more than any other non-trivial encoding, which gets into deep philosophical territory about whether "ö" is the same as "oe", or whether a Chinese character that looks like, and has the same root meaning as, a Kanji character is the "same" character or not. Or whether the Unicode NEL character is a "standard" newline character because it is mainly used on those mainframes we like to pretend don't exist :-)

Tim's approach is taking a real, widespread problem and offering a clean, layered solution (essentially a character encoding preprocessor) rather than changing XML itself.

Actually, a similar idea came up at the Binary Infoset workshop: leverage the fact that the XML spec allows an open-ended set of encodings. This allows experimentation WITHOUT "corrupting" the core spec with support for local languages, stuff of interest mainly to mainframes, or more efficiently transmittable and/or lexable serializations. If these experiments succeed, they can propagate in software without changes to the XML spec; if they fail, they can do so without burdening standards-compliant processors with their lost baggage. There is some problem with interoperability, but no more than there is with other encodings that aren't widely supported outside some language community, and much less than with the common "smart quotes" problem!

> That is therefore an enormous processing model change.

I don't see it that way. It's leveraging the encoding mechanism to do what XML munges up with all the other things that DTDs do. I'm looking forward to using it outside of XML, e.g. to enter accented/umlaut characters without having to remember the platform-specific keyboard hacks or turning on keyboard modes that have all sorts of annoying side effects. How many times have I typed "uber" or "Godel" because it's too hard to remember how to enter "über" or "Gödel" [this time I copied/pasted]!

> I can't say I expected XML to drill into the Unicode layer and
> modify the very notion of a character encoding.

IMHO, it extends the Unicode encoding layer upwards to remove a wart in XML, not vice versa.

Anyway, I think this is a great idea, and I congratulate Tim for working it out and moving it forward.
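
To make the "character encoding preprocessor" idea concrete, here is a minimal sketch in Python of what such a decoding layer might look like. It is not Tim's specification: the function name `decode_utf8_plus_names` is hypothetical, the name table is borrowed from Python's HTML 4 entity list as a stand-in, and leaving the five XML predefined names untouched is an assumption made so that the XML parser above this layer still sees them.

```python
import re
from html.entities import name2codepoint  # HTML 4 names, used as a stand-in table

# Assumption: names XML already predefines are left for the XML parser itself.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches references of the form &name; (letters and digits only).
_NAME_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def decode_utf8_plus_names(data: bytes) -> str:
    """Decode bytes as UTF-8, then expand known character names.

    Hypothetical helper: this runs entirely at the decoding layer,
    before any XML parsing would take place.
    """
    text = data.decode("utf-8")

    def expand(match: re.Match) -> str:
        name = match.group(1)
        # Leave XML's own names and any unknown names exactly as written.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return chr(name2codepoint[name])

    return _NAME_REF.sub(expand, text)
```

With this sketch, `decode_utf8_plus_names(b"G&ouml;del")` returns "Gödel", while `&amp;` and unrecognized names pass through unchanged for the next layer to deal with.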