RE: Multi-lingual experiment - a call for action
Didier wrote:

> It make sense to have a DTD in each language so that, people can experiment
> translating form one language to an other. Do not forget that the experiment
> is about the result of a database.

I suspect that we have a slight disagreement, though, about what we mean by 'translation'. It seems perfectly natural to me to want to translate a document's character content, that content being a necessary and crucial part of its meaning. But as long as we restrict ourselves to a DTD, rather than using, say, Schema, isn't "translating" the DTD more a matter of *structure* than one of content? Documents with different DTDs will necessarily be of different types; we can *transform* one such type to another, but can we say that in doing so we have performed a translation?

My answer is no, because the essence of 'translating' is not only to map "STL Tutorial" to "Une introduction à STL" but also to say that a "title" in English is the same thing as a "titre" in French. This can't be done, as far as I can tell, with a DTD, since it does not have the means of expressing equivalence between structural vocabularies. This is why I think this problem would make a perfect test case for a more sophisticated schema language such as XML Schema.

> a) Example: translate from an XML document encoded with a French DTD into a
> new XML document encoded in German for trading. Should I mention here that
> this matter of fact will happen with a high probability mainly for exchange
> and trade within the European community.

I would argue that an important requirement here would be that either the French or the German version of such a document should pass validation by the same parser.
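To make the transform-versus-translate distinction concrete, here is a minimal Python sketch of a pure renaming transform. The `FR_TO_EN` table and the `rename_elements` helper are hypothetical names introduced only for illustration. The point it tries to show is that the knowledge "titre means title" lives in the program's lookup table; nothing in either DTD records it.

```python
import xml.etree.ElementTree as ET

# Hypothetical French-to-English element-name table (illustrative only).
# This table is the only place where "titre" and "title" are declared
# to be the same thing; neither DTD could express it.
FR_TO_EN = {"objet": "item", "titre": "title"}

def rename_elements(xml_text, mapping):
    """Return xml_text with every element tag renamed through mapping.

    Tags missing from the mapping are left unchanged. Character content
    is not touched: this is a structural transform, not a translation.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        elem.tag = mapping.get(elem.tag, elem.tag)
    return ET.tostring(root, encoding="unicode")

french = "<objet><titre>Introduction à STL</titre></objet>"
english_names = rename_elements(french, FR_TO_EN)
print(english_names)
```

The output carries English element names but still French text, which is roughly why a renaming transform alone does not seem to deserve the word "translation".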
If you're hinting at a sort of "folder" of documents where one "multilingual" element could be the parent of a number of subelements, each representing a different language version of the "same" content, then it seems to me that it would be desirable that each such subelement be, structurally, *equivalent* to any other, even if element names should differ. Example:

    <versions>
      <objet xml:lang="fr">
        <titre>Introduction à STL</titre>
      </objet>
      <item xml:lang="en">
        <title>STL Tutorial</title>
      </item>
    </versions>

With a (very partial) Schema as follows (if I understand Schema at all, that is, which might be far from the case...):

    <schema targetNamespace="http://yo.com/polyglot">
      <element name="T" type="T" abstract="true"/>
      <element name="titre" equivClass="T"/>
      <element name="title" equivClass="T"/>
      <element name="O" type="O" abstract="true"/>
      <element name="objet" equivClass="O"/>
      <element name="item" equivClass="O"/>
    </schema>

In this case a single XSL transform expressed with (say) French element names could be used to output the French version of any <objet> contained within a <versions> folder, even if this <objet> is in fact an <item>... A rose by any other name, etc. (An interesting question is how the equivalence classes themselves should be named; maybe Esperanto...)

> Please, use the accents since French includes accent. If I show you a
> Japanese DTD (unfortunately most mails won't be able to decode UTF-8
> Japanese characters) you'll notice that the elements are full Japanese words
> _not_ cut back ones. So please, include the accents so that it is french not
> a language between two chairs. If we speak of multi-ligual let's be
> multi-lingual. Anyway, don't bother, I'll add them.

Yeah, accents seem to be allowed; looks like I read the spec wrong. Excerpted from the XML 1.0 spec:

    [45] elementdecl ::= '<!ELEMENT' S Name S contentspec S? '>'
    [5]  Name        ::= (Letter | '_' | ':') (NameChar)*
    [84] Letter      ::= BaseChar | Ideographic
    [4]  NameChar    ::= Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar | Extender

Then again, does the above mean an element name can't *start* with a diacritic? That would rule out "éditeur"... It's all spelled out in the spec but I haven't gotten around to learning Unicode yet. I know I should!

========================================
Laurent Bossavit - Ingénieur R&D
>>> laurent@m... <<<  >> ICQ#39281367 <<
MultiMania http://www.multimania.fr/
========================================

***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@x...&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************
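On the closing question about names that start with an accented letter: in the XML 1.0 productions quoted in the message, a precomposed "é" (U+00E9) is itself a BaseChar, hence a Letter, so "éditeur" is a legal Name. What production [5] excludes from the first position is a CombiningChar standing alone, such as U+0301 COMBINING ACUTE ACCENT, or a Digit. The following is a minimal Python sketch for probing this, assuming the standard library's expat-based parser enforces those productions; `is_well_formed_name` is a hypothetical helper written for this illustration, not part of any API.

```python
import xml.etree.ElementTree as ET

def is_well_formed_name(name):
    """Return True if the parser accepts <name/> as an empty element."""
    try:
        ET.fromstring("<" + name + "/>")
        return True
    except ET.ParseError:
        return False

candidates = [
    ("precomposed", "\u00e9diteur"),   # é as the single code point U+00E9
    ("decomposed", "e\u0301diteur"),   # plain "e" followed by U+0301
    ("bare accent", "\u0301diteur"),   # begins with the combining accent itself
    ("digit first", "1titre"),         # a Digit is a NameChar but not a Letter
]

# Report what the parser says about each candidate name.
for label, name in candidates:
    print(label, repr(name), is_well_formed_name(name))
```

Whether a given editor or mail client emits the precomposed or the decomposed form of "é" is a separate matter; under the grammar both spellings begin with a Letter, so either should be usable as the start of an element name.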