RE: Stupid Question (was RE: XML doesn't deserve its
I think the main problems are:

1) Adding type information creates a data model much more complicated than the Infoset. People have to agree about types, their names and their semantics. Your example is OK provided everybody knows what an "Int" is. I think XML Schema's so-called "simple" types are a precious thing here. Why not use the xsi:type notation while we're at it?

For complex types, things are much harder. You need a structure definition, and you won't be able to embed it inline. You can still use xsi:type, but you'll need a schema document to refer to.

The biggest problem is that people still have to understand that the "syntax" part of XML is minute compared to the "tree, infoset, typed infoset" part. A syntax for labeled trees is nothing. As soon as you start to care about manipulating those labeled trees efficiently, a lot of other concepts enter the dance. So type information, be it inline or out of line, will require many people to change their mind about the "T" word (remember this old thread?).

2) Obviously, there is a verbosity and readability problem. Think of the document-oriented XML case. Typing can still be useful, but how are you going to convince people who already scorn typing tags that they'll have to type tags AND type information within those tags?

3) What about tags like <address my:type="address">...</address>? I'll answer your stupid question with an even more stupid one: what is the difference between a tag name and the type of its element? I guess tag names qualify the role of the relation between a tag and its direct ancestor, while types are associated with a schema or pattern that puts a certain number of constraints on the content of the element. But is that certain?

4) More generally, to reply to your question and my own stupid question, I think we should not immediately think of a way to represent type names. First, we should think about what types are FOR.
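To make the xsi:type suggestion in point 1 a bit more concrete, here is a minimal Python sketch of a reader that converts element content according to an inline xsi:type attribute. It is only an illustration under stated assumptions: the names CONVERTERS and read_typed are hypothetical, only three built-in simple types are handled, and the type name is matched as the literal string "xsd:int" rather than resolved as a real QName against the in-scope namespace bindings. The values use the lexical forms XML Schema expects (decimal integer, ISO 8601 date), not those of the quoted example.

```python
import xml.etree.ElementTree as ET
from datetime import date

# ElementTree exposes namespaced attributes as "{namespace-uri}local-name".
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical converter table for a few XML Schema simple types.
# Assumption: the prefix "xsd" is bound to the XML Schema namespace.
CONVERTERS = {
    "xsd:int": int,
    "xsd:string": str,
    "xsd:date": date.fromisoformat,
}

def read_typed(xml_text):
    """Return {tag: converted value} for the children of the root element."""
    root = ET.fromstring(xml_text)
    result = {}
    for child in root:
        # Content without xsi:type is treated as a plain string.
        type_name = child.get(XSI_TYPE, "xsd:string")
        if type_name not in CONVERTERS:
            raise ValueError(f"unknown type {type_name!r} on <{child.tag}>")
        result[child.tag] = CONVERTERS[type_name](child.text or "")
    return result

DOC = """<myData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <foo xsi:type="xsd:int">42</foo>
  <bar xsi:type="xsd:string">Someday</bar>
  <baz xsi:type="xsd:date">2037-10-31</baz>
</myData>"""

print(read_typed(DOC))
```

This works only because both sides already agree on what "xsd:int" means, which is exactly the agreement problem described above. A complex type would need a schema document to refer to, which this sketch does not attempt.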
For a fresh start, let's forget about validation and focus on extensibility. My intuition is that we need some kind of metadata to process XML data in an extensible way. What would help me write programs that can gracefully handle new document structures, or at least programs that can easily be extended to support them?

To begin with, knowing that I can fetch the same set of labeled data from any given element in the old and new document structures is a plus. This means that when I write code which processes some data, I want this data to respect a kind of contract stating that, whatever the structure of the data, I can obtain a particular view with a fixed labeled tree structure. This contract allows me to write procedural code, because in a procedural or OOP language there is a strong binding between code and the data it manipulates. New code can use the new document structure, because its contract is extended to a new view with a different labeled tree structure. This "contract" between the code and the data is a precious piece of metadata when writing programs that can handle extensibility.

Architectural Forms were built with this "contract" idea, AFAIK, as an architecture can be used to build a "view" over elements. Note that AFs are expressed by special attributes, so it's a bit like "inline type information", except that DTD default attributes can be used to keep the document less verbose. The question remains whether AFs can provide all the flexibility we need to support any kind of schema evolution. For example, AFs cannot handle the situation in which some legacy code needs <firstname> first, then <lastname>, and a new document has those elements in reverse order.

Anyway, OOP provides interesting concepts regarding "contracts", "extension of contracts", etc.: the public interfaces of classes (or pure interfaces) and inheritance.
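The "contract" idea above can be sketched in a few lines of Python: legacy code asks for a fixed view (here firstname, then lastname), and an adapter builds that view by label, whatever order or extra elements the document uses. This is only an illustration of the idea, not an implementation of Architectural Forms. The names PERSON_VIEW, person_view and legacy_greeting, the <person> and <middlename> elements, and the sample data are all hypothetical.

```python
import xml.etree.ElementTree as ET

# The contract: legacy code always receives these labels, in this order.
PERSON_VIEW = ("firstname", "lastname")

def person_view(person):
    """Build the fixed view from a <person> element, ignoring child order and extras."""
    view = []
    for label in PERSON_VIEW:
        child = person.find(label)
        if child is None:
            raise ValueError(f"document breaks the contract: missing <{label}>")
        view.append((label, child.text or ""))
    return view

def legacy_greeting(view):
    """Legacy code written against the view, not against the document structure."""
    (_, first), (_, last) = view
    return f"Hello, {first} {last}"

OLD_DOC = "<person><firstname>Jane</firstname><lastname>Doe</lastname></person>"
# New structure: reversed order plus an element the legacy code has never seen.
NEW_DOC = ("<person><lastname>Doe</lastname><middlename>Q</middlename>"
           "<firstname>Jane</firstname></person>")

for doc in (OLD_DOC, NEW_DOC):
    print(legacy_greeting(person_view(ET.fromstring(doc))))
```

New code that cares about <middlename> would be written against an extended view, much as a subclass extends a public interface, while the legacy view stays available to old code.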
A schema language like XML Schema also supports those concepts, yet it is very difficult nowadays to access schema information from a random XML document (or even from an XML document with an associated schema, because the PSVI is not widely supported for now).

Then, when encountering a new document structure, I want to know whether it is safe for me to process the document. Maybe the <nuclear-plant:stop-cooling-core/> tag now has a just-kidding="true" attribute in the new document structure. In this case, my old program should not be able to process those documents, even if their structure is seemingly compatible: just ignoring the "just-kidding" attribute would have dire consequences... This means that, independently of a definition of structure, I need some metadata that gives me versioning information, compatibility guidelines, or general semantic information.

The need for general semantics can be illustrated by an XML document used to send commands to toy trains. Even if the document structure were exactly the same for real trains (if those could receive XML documents as commands, and pigs could fly while we're at it), I wouldn't want my toy train commands to be interpreted by real trains. Tag names or structure are of no help there: I need something that lets me recognize a document which really means what I expect it to mean. "Document type" is the first name that comes to my mind for this...

I really think that we should think about how types can be used to help us achieve true extensibility. Types are not only for validation; they are a true help for extensibility. People seem reluctant to systematically layer a type system including OOP concepts like interfaces and inheritance over XML. The problem is that without this, and without extensibility guidelines, XML effectively does not deserve its 'X'.

Regards,
Nicolas

>-----Original Message-----
>From: Mike Champion [mailto:mc@x...]
>Sent: Tuesday, March 5, 2002 19:26
>To: xml-dev@l...
>Subject: Stupid Question (was RE: XML doesn't deserve its "X".)
>
>
>3/5/2002 12:35:49 PM, Nicolas LEHUEN <nicolas.lehuen@u...> wrote:
>
>>
>>That's what I was suggesting. However, I don't see how this can be achieved
>>without adding type information (AKA PSVI) to XML elements, and have a
>>typing system that supports extensibility. Looks like we're reinventing OOP
>>there, with XML as a data serialisation format.
>
>This is in the spirit of "if we were doing this all over again ..." (or
>"if we were furry little creatures eating dinosaur eggs and planning for
>the post-asteroid world ..."), not a troll:
>Why does XML carry around a label for every data value rather than
>getting it from an out-of-band "schema" (a la EDI or
>ASN.1), but then use an out-of-band means to associate type information,
>thus necessitating the PSVI?
>
>In a programming language, we say
>
>  class MyData {
>    Int foo;
>    String bar;
>    Date baz; }
>
>Serializing an instance to XML gives:
>  <myData>
>    <foo>0xffffffff</foo>
>    <bar>Someday</bar>
>    <baz>20371031</baz>
>  </myData>
>
>Why not just put the type information inline and
>make XML more "self-describing" (please don't
>shoot me ...)
>  <myData>
>    <foo my:type="Int">0xffffffff</foo>
>    <bar my:type="String">Someday</bar>
>    <baz my:type="Date">20371031</baz>
>  </myData>
>
>
>or else just give it up and use
>ASN.1 for both the out-of-band
>label and type information ?
>
>I'm sure this is a religious war I missed, somehow ...
>and like I said, it's a stupid question, please be
>merciful.
>
>-----------------------------------------------------------------
>The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>initiative of OASIS <http://www.oasis-open.org>
>
>The list archives are at http://lists.xml.org/archives/xml-dev/
>
>To subscribe or unsubscribe from this list use the subscription
>manager: <http://lists.xml.org/ob/adm.pl>