> I am having a hard time accepting the case for mixed content, especially
> based on the arguments I have seen.

One important reason is internationalization. Japanese, in particular, has too many homophones and variant readings to make either syllabically-spelled words or ideographically-written characters completely satisfactory. The common "writing-on-the-hand-when-talking" behaviour that strikes foreigners in Japan is evidence of this.

To overcome this, Japanese have adopted a system of annotated writing, which we can call Ruby (after the 4? point characters). These allow ideographs (whose meaning may be readable but pronunciation unclear) to be coupled with their phonetic spelling. They also allow contractions to be spelled out, or even little translations of unusual foreign words or names to be given in the text. Similar annotations are also used by Taiwanese, with the bopomofo syllabary, for teaching children and for rare ideographs.

One of the promises of XML over third-normal-form data is therefore that mixed content provides a way for Japanese people (etc.) to use their traditional Japanese solution (ruby annotations) and overcome the alphabet-centrism of RDBMS and third normal form.

Some internationalization people even go as far as saying that *all* text in a schema intended for international use should be mixed content. I.e., XML's string type should be the exception, to be used only when the pattern facet is used to disallow Han ideographs.

Obviously, this can freak out RDBMS people. But why should East Asians settle for text in databases being less comprehensible than free text, in ways that do not affect alphabetic scripts?

Cheers
Rick Jelliffe
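
For readers who have not seen ruby markup, here is a minimal sketch of the idea in Python. It assumes element names `ruby`, `rb` and `rt` in the style of the W3C Ruby Annotation work; the `p` wrapper, the sample sentence and the helper names `base_text` and `readings` are hypothetical, chosen only for illustration. It shows why the annotated text has to be mixed content: the annotations sit inside the running text, and a consumer can either ignore them or pull out the readings.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: running text interleaved with <ruby> elements,
# i.e. mixed content. <rb> holds the base text, <rt> its phonetic reading.
DOC = (
    "<p>I live in "
    "<ruby><rb>東京</rb><rt>とうきょう</rt></ruby>"
    " and work in "
    "<ruby><rb>大阪</rb><rt>おおさか</rt></ruby>"
    ".</p>"
)

def base_text(elem):
    """Return the running text with the ruby readings dropped."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "ruby":
            # Keep only the base characters of an annotated span.
            parts.append(child.findtext("rb", default=""))
        else:
            parts.append(base_text(child))
        # Text following the child element belongs to the parent's content.
        parts.append(child.tail or "")
    return "".join(parts)

def readings(elem):
    """Return (base, reading) pairs for every ruby annotation."""
    return [
        (r.findtext("rb", default=""), r.findtext("rt", default=""))
        for r in elem.iter("ruby")
    ]

root = ET.fromstring(DOC)
print(base_text(root))
print(readings(root))
```

A plain string column could store either the base text or the reading, but not the pairing of the two at the exact position in the sentence, which is the point of the argument above.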



