Re: Namespaces, Architectural Forms, and Sub-Documents
David Megginson wrote:
>
> XML documents may (and perhaps, usually will) contain non-XML objects
> such as wordprocessor documents, spreadsheets, MPEG clips, Java
> applets, audio sequences, and many others -- to date, thankfully, no
> one has proposed uuencoding any these and dumping them inline between
> a start and and tag.

Maybe not on this mailing list, but come on over to "SGML-TOOLS" (formerly LinuxDoc). :) :)

> Why should we treat an equation marked up in XML differently than an
> equation marked up in Microsoft Word? It seems easier (from a user's
> perspective) to treat everything as objects, rather than defining one
> special case.

We should treat them differently for three reasons:

#1. XML data is text, and thus makes a certain amount of "sense" inline. If I embedded LaTeX in an XML document I would probably inline it rather than refer to it, for the same reason. Word formulae are binary.

#2. XML has concepts such as validation and id-reference that depend on data being logically inline.

#3. If we do not do this, I do not think that people will use subdocs. They will probably just abandon validation or use XML-Data.

> Object-oriented programming has proven the value of
> encapsulation, and the compound-document idiom is standard on millions
> of desktops already, so we can hardly argue that subdocuments are an
> unfamiliar approach.

Not so. Word does not use externally embedded data by default. If you create a table, formula or a graphic, it is inlined by default. Typically you only externally link to a file if it already exists (e.g. it has some meaning independent of this document). I think Microsoft made the right choice there.

> I am a big fan of pragmatism on the implementation side, as people
> might have noticed from my postings on the design of AElfred; on the
> standards side, though, I wouldn't want to cripple a spec just to work
> around a temporary problem that will have to be solved anyway for
> non-XML objects.

SGML is 12 years old.
We are only marginally closer to having decent tools that will manage this stuff for us. I personally have no faith that they will arrive soon. I also think that we have 10 years of good experience with what we need to guide our choices. Most major DTDs incorporate ad hoc DTD modularity features. We know what they need to make these features robust -- just namespace protection.

> SGML people will remember unfortunate features like
> SHORTREF, DATATAG, and OMITTAG -- included a little over a decade ago,
> likewise, for the sake of making things easy and working around
> temporary deficiencies in the available tools.

Well, I still use two of those three features, so obviously the problems with the tools have not sufficiently cleared up yet. It also isn't clear to me whether those features have helped or hurt SGML's popularity. OMITTAG in particular is very widely used. Even HTML uses it.

> > * element type constrainability (how do I specify a SUBDOC root element
> > type in a content model?)
>
> Use HyTime (just joking). Seriously, I cannot see that this is a
> worse case than not being able to use a DTD at all.

It isn't. But in XML we do have DTDs and we want to use them for these heterogeneous (not "compound") documents.

> The general idea
> of compound documents (Netscape with plug-ins, OLE documents, Andrew
> documents, or otherwise) is that you can plug in any object -- I had
> imagined that this was the goal of namespaces as well.

I don't think so. In my paper I quoted from the XML Namespaces spec:

"We envision applications of XML in which a document instance may contain markup defined in multiple schemas. These schemas may have been authored independently. One motivation for this is that writing good schemas is hard, so it is beneficial to reuse parts from existing, well-designed schemas. Another is the advantage of allowing search engines or other tools to operate over a range of documents that vary in many respects but use common names for common element types."

The goal of combining schemas is central to the concept.

> In XML you can
> constrain the placement of pointers to external objects, at least.

Cold comfort. :)

> > * "content model communication" (how do I pass a %cell; content model
> > into my table subdoc)
>
> You're thinking of CALS here. I'd suggest that we move away from the
> older SGML model of heavily parameterised DTDs (as from heavily
> #IFDEF'ed C header files): remember that one of the arguments for the
> namespace model is to reuse stylesheets and other processing
> specifications -- if a table model can vary its content unpredictably,
> then you will not be able to reuse stylesheets anyway.

The formatting for the contents of table cells and for the shape of the table can be specified independently. In HTML (for example), essentially anything can go in a table cell. The table formatter just figures it out. A good stylesheet language will provide quite a bit of independence between construction rules. Yes, we may need some conventions for more complex combinations (e.g. metadata formatting conventions), but most things will "just work."

> > * ID linkage (even for simple links I must use some more advanced
> > linking strategy)
>
> HREFs would work fine -- HTML people are already used to
>
>   <a href="book.html#chapter3">
>
> so we should have no confusion here.

> > * semantics (i.e. SUBDOC has none...you need VALUEREF or something else
> > on top of subdoc)
>
> I expect that XLL will provide mechanisms for expressing the 'embed'
> semantic.

Both of these proposals just add hassles to something that should be simple.

> Furthermore, you have the
> advantage that your document's validity does not depend on its child
> objects (this is very important for document management in large,
> multi-author systems -- if subdocuments are atomic, then a change by
> one author to a table, for example, will not make the containing
> chapter invalid).
> Again, as in programming, encapsulation will be a
> big win in the medium term.

Yes, there are occasions where this encapsulation is important and useful. There are also times when it is not. Let me put it this way: do you feel that the creators of DocBook, TEI and HTML were mistaken in including table models rather than forcing their users to use subdocs? If yes, then you have a very different idea of usable DTD design than I do. If no, then I cannot understand why you are opposed to making this process of including table models easier, so that it does not take people with brains the size of planets and a serious commitment to DTD use to accomplish it.

All I am asking is to make this common DTD fragment combination idiom simpler, more standard and more robust, so that casual (and expert!) users can whip up their own DTDs by combining fragments instead of manually merging fragments, disambiguating names, adding architectural forms, etc.

Paul Prescod -- http://itrc.uwaterloo.ca/~papresco

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)