Re: A Plea for Schemas
Matthew Gertner wrote:
>
> I have written a short "XML Rant"

Enjoyable. It is good to see some reasonable passion from a reasonable mind. Here is some rant for the rant.

o "the 1980s, Charles Goldfarb invented SGML".

Ok for a rant, but ISO created SGML. If any man can be said to have led that work, it is Dr. Charles Goldfarb at IBM Almaden. He was a member of the IBM team (Goldfarb, Mosher, Lorie) that designed GML. To the idea of GenCodes, GML added, among other things, type-defined namespaces for markup. GML and research were combined to propose and ratify ISO 8879. Invention like that is a community process. Dr. Goldfarb leads that community.

In the late 1960s, publishers needed a means to exchange working files. A solution proposed at that time, GenCodes, was supported. The limited power of sharing the same single namespace (the GenCodes) did not evolve. The reasons are not complex and are the same as for HTML: the namespace represents a local application context. When shared for all types, it limits the expressiveness needed to document multi-context real time events.

o "..thousands loved it."

Conceded. SGML was an expensive system deployed on what were then mostly mainframe and mini environments. Who had it? Aerospace tech writers, some artists, and lawyers. Why? They had a use for it, and the costs were justified relative to the cost of the lifecycle of the information in its topical context. Manuals. Expensive ones.

SGML lends itself to interpreted means, and interpreted means are inefficient. That is relative to resources. As soon as SGML was moved to PC-based systems, it became cost-effective. There are and were examples of SGML-based systems working well for hypertext client applications in those environments. Except for lowly IADS, mostly expensive ones. Systems like IADS proved SGML, if deFanged a bit, could be deployed cheaply. Free even. IADS did not use a DTD. It used a stylesheet (circa 1990).
It had a DTD, and the tags within it were modifiable and extensible via the stylesheet processor. Its tags (file, frame, hyperlink) were the equivalent of the ThenMalignedAndDespised PROCESSING INSTRUCTIONS, but they looked like tags, so DTDs written for the system incorporated them and went on about their business. Framing worked.

In 1989:

1. Software was expensive
2. Hardware was expensive
3. The dominant application of SGML (1000dpi print) was hard.

SGML emerged into more general use when more power was on more desks. Complexity coupled to complexity produces emergence. TCO. The critical innovation to enable the emergence of SGML came from Intel, et al. The unification of a significantly sized software base by a dominant operating system company did the rest. Kick MS as much as you want to; without them, the Web today would still be something university students surfed and researchers occasionally mastered, IMNSHO.

HTML emerged when:

   o The Internet was opened to commercial use
   o The power of the processor could support the lowest-common-denominator application of SGML
   o Governments paid to implement and give away a means and process to share the namespace in that application
   o A person to lead the effort emerged with a plan that would work: Tim Berners-Lee, HTTP and HTML.

These convergent events, all in the same five years, gave you the WorldWideWeb.

o HTML is a subset of SGML: NYET.

Get out the ruler and rap the knuckles. XML is a subset of SGML. HTML is an *application* of SGML. It is obnoxious, and I apologize in advance, but getting others to understand **that** critical difference in thinking about markup is very hard sometimes. Where I put "application", some say "vocabulary". Que bueno, but as Charles said, "conserve names" and that is all. Systems are invented or specified. Vocabularies are spoken.

HTML was not hobbled. It was distilled, like other vocabularies, from agreements made among organizations to share information.
CERN, Univ of Ill, DARPA agree to make such agreements, and vocabularies are the result of that agreement. What the organizations share are namespaces and the implementations of processors for creating, adding, deleting, or modifying statements in those namespaces. HTML was GenCode: partDeux. TimBL gets the credit, but there were those who helped him, and if you ask, I'm sure he will tell you names.

Names are what is shared. It's all about names. Read XML 1.0 and, IMHO, that is the conceptual breakthrough needed to understand markup. In essence, SGML has always been principally a lexical standard. That structural integrity is important, and specifying that provides the necessary freedom from implementation to enable an inexhaustible range of expression. It also makes the agreement needed to implement a system to use it very expensive. XML locks down the SGML Declaration. Most of the biggest changes from SGML start there.

To keep the original expressive power, the means for making beyondLex agreements are still needed. A DTD is not about lexical validation only. It is about validating a hierarchical namespace to determine conformance. Whether you use DTDs, MS Schemas, XML Schemas (someday), or just use the table design window for Access or Oracle, validating a vocabulary requires you to declare one or derive it. IMHO, of the two means, declaration is usually cheaper, but it is always political. Politics are human means to declare namespaces. BizTalk and OASIS both exist because of the names and interest of those named in the shared politics of creating their shared namespaces. That is all. XML does not care.

Syntax unification is not enough. Using markup systems requires you to accept the idea that the namespace is primary. What does that mean? Just as SQL systems must disambiguate aggregate naming, so must markup systems. A name means what you need it to. It must be unique and persistent to be a name, and you require a means to discover if it is meeting that need.
Trust but verify. Schemas are just one of the tools for discovering if that is the case. You can do more with schema information in the same way the relational system does it. Names are associated to create processable unique names.

You can do a lot with the DTDs and schemas, really. They are just metainformation by which you agree to organize the screen and the objects on it, or the messages among objects, or whatever you want to talk about. The reason to use them is for validation or as a source for initialization. In effect, they really are just another database of names and values. That is what makes using XML Schemas (as opposed to DTDs) attractive. Applications outside very specialized ISO 8879-conforming processors for DTDs are also useful for managing the namespace of that metainformation.

DTDs do not aggregate; so, if instances do, they are not validatable. That does not keep them from being useful. The names in the space are unique. Their persistence is questionable, yet if you treat them as a relational designer treats a view, they are very useful. Well-formed is what you need for any lifecycle of the information. Valid is what you need to ensure correct processes among systems that use the information at particular times. When a formal means to persist these better is provided, then we have a very good system for maintaining namespace communities.

Schemas organize a namespace; not doing that is relaxing a design constraint on the namespace. Relaxing that constraint is efficient, particularly now that database systems are so cheap and ubiquitous that using them for serving strings is optimal. Correct-by-construction from a trusted source is faster, more compact, and less restricting on system evolution.

Badly-formed HTML? It was a trade-off. It cleans up over time. Better tools, better hunts, better times. All XML says is, you don't have to use the DTD. It doesn't say it isn't useful.
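For anyone who wants the well-formed versus valid distinction in running form, here is a minimal sketch in Python. Everything named in it is an assumption for illustration: the VOCABULARY table, the element names, and the two functions are hypothetical, and the child-name check is only a toy stand-in for a DTD or schema, not a real validator. It uses the standard library's xml.etree.ElementTree, which checks well-formedness but does not validate against a DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical agreed vocabulary: element name -> names allowed as children.
# This stands in for the declared namespace a DTD or schema would carry.
VOCABULARY = {
    "order": {"customer", "item"},
    "customer": set(),
    "item": set(),
}


def is_well_formed(text):
    """True if the text parses as XML at all (lexical agreement only)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def is_valid(text, vocabulary=VOCABULARY):
    """True if the text is well-formed AND every name fits the vocabulary."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False

    def check(element):
        # The element's own name must be declared in the vocabulary.
        if element.tag not in vocabulary:
            return False
        allowed = vocabulary[element.tag]
        # Every child must be allowed here and must itself check out.
        return all(child.tag in allowed and check(child) for child in element)

    return check(root)
```

A document such as `<order><price/></order>` passes is_well_formed but fails is_valid, because "price" is not a name the parties agreed on. That is the whole point: the parser does not care, the community sharing the vocabulary does.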
Enlightened XMLers write them and use them and even throw them away. A DTD is a snapshot of the organization of a namespace in time. Time moves on. Information does too. The DTD might not. Some part of it probably will and will influence the next version. The reason to use or not use a DTD or any other schema is determined by the namespace evolution: an evolution of agreements, and so of cooperation. Cooperation among large human communities is always furthered when agreements about what to name the names are simple and easy to verify. When the means to communicate among companies became the Web, the need to verify these agreements by simple means became an ecological imperative.

So, patience. But don't quit pleading. Namespaces are gardens. To grow usefully, they have to be tended. It takes tools, lots of them, for particular purposes, to do that. Most of us have sheds full of tools we only use occasionally next to ones we use every day.

That golden 10% of XML is the distilled essence of SGML and the years of practice and competing, sometimes awkward specifications and standards written there by all of the people I met in those years. Even those HyTime guys worked on creating XML. HyTime, DSSSL, TEI, but before them, Dexter, FRESS, Engelbart, all feed the single stream that is now XML, and, as with SGML, all the competing, sometimes awkward specifications are being written by many of the same people.

If you want to plead for schemas, I plead with you. Schemas are a tool for validating agreements among overlapping namespace communities. Ecom-ecologies (keiretsu) emerge because the tools they use to make agreements, their namespaces, become efficient. S = k log W (Boltzmann). To control the temperature, control the value of W. DTDs help you control the rate at which entropy consumes referents.

The trick to fix the web is to fix the web's indexes. To do that, ensure the agreements by which the indexes are made enable validation of the namespaces indexed.
Well-formed, and valid by agreement, are the keys to creating semantic space, overlapping vocabularies, if that is what you want. DTDs are a tool to make agreements. Beyond the agreement are the names that agree. XML Doesn't Care. You do.

You write:

> Dilution of the basic principles of generic markup, and misunderstanding of their purpose, will then give rise to inevitable disappointment, and hence rejection: "We switched our whole company over to XML and we still can't interchange data effortlessly. So this means that XML doesn't work, right?"

How many 'MLers here want a dollar for every time you've heard that? Tell 'em, "ahh, XML Works. We just don't agree on how."

len bullard

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message; unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message; subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)