RE: CDATA by any other name... (was The raw and the cooked)
> From: John Cowan
> Rick Jelliffe wrote:
>
> > (An optimistic view of ISO10646: there are dozens of new Han ideographs
> > created every day, apart from other scripts.)
>
> True but irrelevant, since no specifiable character set can hold these.

Not so. The additions are composed of standard radicals and combinations. There are various projects around (such as C.C.Hsieh's in Taiwan) to figure out encodings to "spell" Han ideographs by component radicals. This would allow any number of characters and even variant forms. But this is not in ISO 10646 yet.

I guess the point is that John thinks that if an XML system can produce characters which a recipient system cannot process, because it does not use ISO 10646, that is not something that CDATA sections should be used to address. I think his reason is that he cannot see it in the spec. Dave M thinks that xml:lang is appropriate. My point about CDATA elements was that there is no standard mechanism to lock CDATA marked sections. I think a lot of people now think that any non-ISO10646 system is for losers anyway (except for whatever character set they use, probably).

> .. the repertoire of a language is
> a sticky wicket. In the domain of "xml:lang='en-US'", am I to be
> forbidden to write "naïve" or "coöperate"? How about "résumé" or
> "Québéc"?

The primary purpose of xml:lang, as far as I am concerned, should be to convey the information lost by ISO 10646 unification: where the Japanese and Chinese glyphs (or Polish and Russian) for a unified character differ, transcoding and unifying the characters into ISO 10646 can lose information unless the xml:lang attribute is set. After that, xml:lang can be used to label text for the purposes of variant character selection, and after that for marking up the natural language.

But I am not trying to fix the repertoire of a language (TEI WSD can declare it, though).
I am just thinking about how to constrain XML documents so that they will not contain characters which will break non-ISO10646 target systems.

Rick Jelliffe
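
As an illustration of the kind of constraint discussed above, here is a minimal sketch that reports which characters in an XML document a given target encoding cannot represent. It is not a mechanism proposed in this thread: the function names `unencodable_chars` and `check_document`, the sample document, and the choice of Latin-1 and ASCII as stand-ins for a non-ISO10646 target system are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def unencodable_chars(text, encoding):
    """Return the distinct characters in text that the target encoding cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


def check_document(xml_source, encoding):
    """Collect unencodable characters from character data and attribute values.

    Hypothetical helper: element and attribute names are not checked here.
    """
    root = ET.fromstring(xml_source)
    found = []
    for elem in root.iter():
        pieces = [elem.text or "", elem.tail or ""] + list(elem.attrib.values())
        for piece in pieces:
            for ch in unencodable_chars(piece, encoding):
                if ch not in found:
                    found.append(ch)
    return found


# Sample document (an assumption for illustration): accented Latin letters plus one Han ideograph.
doc = '<p xml:lang="en-US">na\u00efve r\u00e9sum\u00e9 \u6f22</p>'

print(check_document(doc, "latin-1"))  # only the Han ideograph is reported
print(check_document(doc, "ascii"))    # the accented letters are reported as well
```

A check like this could run before a document is handed to the target system, so that offending characters are flagged (or replaced by some agreed fallback) rather than breaking the recipient.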