Re: Class hierarchies in XML
At 09:27 PM 7/2/97 -0400, Giovanni Flammia wrote:
>We really need to build an object-oriented hierarchy, with classes that
>are extended by subclasses and so on... For example, a <restaurant> is a
>subclass of <location> and inherits the properties of <location> such as
><address> and <street number>, but adds other properties, such as <menu>.
>What is the proper syntax for expressing classes and sub-classes, or
>types and subtypes, inheritance and so on? i.e., how do I tell in a
>document that <restaurant> is a subclass of <location> (and perhaps
>allow even multiple inheritance?)

The architectural approach, using the formalism and syntax defined by the Architectural Forms Definition Requirements (AFDR) annex of HyTime (2d Edition), works as follows:

1. You define some set of superclasses. This definition consists of two essential parts: a set of SGML element and attribute declarations (i.e., a "DTD") and some documentation of the semantics of these classes. This serves to define a set of semantics, give them names (the element types and attributes), and give the whole set a name (the public ID or URN of the superclass set, declared as a notation).

These definitions are first and foremost *documentation*. However, the declarations can be used to do validation of documents against the architecture, if desired (the SP parser supports this, for example). They may also suggest the design of object-oriented programs that provide "methods" for the element classes.

For example, the following set of declarations declares an architecture for describing "locations":

<!-- Declarations and documentation for the "location" architecture.

     Refer to this architecture with the public ID
     "-//G. Flammia//NOTATION Location Architecture//EN".
-->

<!ELEMENT location
   -- A place (building, venue, etc) people go --
   -- A location must have a name and address. It may have
      additional descriptive properties appropriate for the
      type of place (e.g., a Restaurant may have a menu
      property) --
   - -
   (name, address, loc-descriptor*)
>
<!ATTLIST location
   ID
      ID
      #IMPLIED
      -- Unique ID of the location, to enable linking --
>
<!ELEMENT name
   -- A descriptive name for a location --
   - -
   (#PCDATA)
>
<!ELEMENT address
   -- The address of a place --
   - -
   (address-item+ | address-block)
>
<!ELEMENT address-item
   -- A component of an address (e.g., street, city, ...) --
   - -
   (#PCDATA)
>
<!ELEMENT address-block
   -- An unstructured address --
   - -
   (#PCDATA)
>
<!ELEMENT loc-descriptor
   -- A descriptor for a location --
   -- Contains additional descriptive information for a location --
   - -
   (#PCDATA | loc-bridge)*
>
<!ELEMENT loc-bridge
   -- "Architectural bridging element". Generic, semantic-less
      structure (e.g., Paragraph). --
   - -
   (#PCDATA | loc-bridge)*
>
<!-- End of Location architecture -->

This set of declarations has defined a very general set of superclasses, defining and documenting the minimum requirements for describing locations. Note that these are *minimum* requirements--you can add additional sophistications when you specialize from this general architecture. The loc-descriptor and loc-bridge element forms are intended to be specialized for different kinds of locations.

I can now define a "Restaurant Architecture", derived from the Location architecture, that adds specialized elements unique to (or needed for) restaurants. This also defines a set of superclasses, derived from the location superclasses, but intended to be specialized for individual documents. Again, the primary purpose of the following is to formally declare and document the classes and their semantics.

<!-- Restaurant description architecture. Derived from the
     location architecture.

     Refer to this architecture with the public id
     "-//Eliot Kimber//NOTATION Restaurant Architecture//EN"
-->

<!-- Declare names of superclass set this set of declarations
     is derived from: -->
<?IS10744 ArcBase location>
<!NOTATION location
   PUBLIC "-//G. Flammia//NOTATION Location Architecture//EN"
   -- Pointer to superclass "location" architecture --
>

<!ELEMENT Restaurant
   -- Describes a restaurant --
   - -
   (Name, Address, Menu, Hours, Cost)
>
<!ATTLIST Restaurant
   ID
      ID
      #IMPLIED
   location
      NAME
      #FIXED "location"
      -- Define derivation of class "restaurant" from superclass
         "location" --
      -- Attribute name "location" is name of architecture
         (coincidentally the same as the key class in the
         architecture in this case). --
>
<!ELEMENT Name
   -- A descriptive name for a location --
   - -
   (#PCDATA)
>
<!ELEMENT address
   -- The address of a restaurant --
   - -
   (street, city, state, zip, phone)
>
<!ELEMENT (street, city, state, zip, phone)
   -- Parts of an address --
   - -
   (#PCDATA)
>
<!ATTLIST (street, city, state, zip, phone)
   location
      NAME
      #FIXED "address-item"
>
<!ELEMENT Menu
   -- The menu for a restaurant --
   - -
   (Menu-item+)
>
<!ATTLIST Menu
   location
      NAME
      #FIXED "loc-descriptor"
>
<!ELEMENT Menu-item
   -- An item on the menu --
   - -
   (#PCDATA)
>
<!ATTLIST Menu-item
   location
      NAME
      #FIXED "loc-bridge"
>
<!ELEMENT (Hours, Cost)
   - -
   (#PCDATA)
>
<!ATTLIST (Hours, Cost)
   location
      NAME
      #FIXED "loc-descriptor"
>
<!-- End of Restaurant architecture declarations -->

Here's how you relate the restaurant declarations to the location declarations:

1. Any element type in Restaurant that has the same name as one in the location architecture is automatically derived from the location form (e.g., "name").

2. The "location" attribute defines the mapping for all other element types.

In this case, every element type in the Restaurant architecture is derived from a superclass form in the location architecture, but that's not a necessary requirement.
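The two mapping rules above can be sketched in a few lines of Python. This is only an illustration of the lookup, not how SP or any AFDR engine implements it. It assumes names are compared case-insensitively (SGML-style folding), and it uses a hypothetical table, FIXED_LOCATION_ATTR, to stand in for the #FIXED "location" attributes that a DTD-aware parser would supply. The function name architectural_form and the <wine-list> element are likewise invented for the example.

```python
import xml.etree.ElementTree as ET

# Form names taken from the "location" architecture declarations.
LOCATION_FORMS = {
    "location", "name", "address", "address-item",
    "address-block", "loc-descriptor", "loc-bridge",
}

# Hypothetical stand-in for the #FIXED "location" attributes in the
# Restaurant declarations; a DTD-aware parser would supply these.
FIXED_LOCATION_ATTR = {
    "restaurant": "location",
    "street": "address-item",
    "city": "address-item",
    "state": "address-item",
    "zip": "address-item",
    "phone": "address-item",
    "menu": "loc-descriptor",
    "menu-item": "loc-bridge",
    "hours": "loc-descriptor",
    "cost": "loc-descriptor",
}


def architectural_form(elem, arch_name, forms, fixed):
    """Return the superclass form for elem, or None if it is not derived."""
    tag = elem.tag.lower()  # assumption: SGML-style case folding
    # Rule 2: an attribute named after the architecture gives the mapping,
    # whether written in the instance or defaulted from the declarations.
    mapped = elem.get(arch_name) or fixed.get(tag)
    if mapped:
        return mapped
    # Rule 1: an element type with the same name as a form is derived from it.
    return tag if tag in forms else None


# <wine-list> is a hypothetical element with no superclass form.
doc = ET.fromstring(
    "<restaurant><name>Kreiz' Barbeque</name>"
    "<menu><menu-item>Brisket</menu-item></menu>"
    "<wine-list/></restaurant>"
)
for elem in doc.iter():
    form = architectural_form(elem, "location", LOCATION_FORMS, FIXED_LOCATION_ATTR)
    print(elem.tag, "->", form)
```

An element such as the invented <wine-list> maps to no form at all, which matches the point that derivation from a superclass form is not required of every element type.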
In addition, any subclass architecture or document can be derived from multiple superclass architectures.

These two architecture declarations define a class hierarchy. The syntax and declarations are formal enough to enable processing and validation of documents against these declarations. However, their first and foremost purpose is as *documentation* for humans to read and understand.

Now I want to create a document that describes a restaurant. This document will be derived from the Restaurant architecture. In an XML environment, if we assume that there are no declarations for the document, then the restaurant architecture defines the rules for documents but, because it isn't used as the document's actual DTD, it needn't be processed in order to parse the document. (But note that the restaurant architectural declarations *could* be used as a document's DTD declarations if desired, because the syntax is the same.)

Here's a restaurant document derived exactly from the restaurant architecture:

<?XML 1.0?>
<!DOCTYPE Restaurant SYSTEM "" [
<?IS10744 ArcBase restaurant>
<!NOTATION restaurant
   PUBLIC "-//Eliot Kimber//NOTATION Restaurant Architecture//EN">
]>
<restaurant>
<name>Kreiz' Barbeque</name>
<address>
<street>Off the square</street>
<city>Lockhart</city>
<state>Texas</state>
<zip>787xx</zip>
<phone>512-555-1234</phone>
</address>
<menu>
<menu-item>Brisket</menu-item>
<menu-item>Prime rib</menu-item>
<menu-item>Pork chops</menu-item>
</menu>
<hours>8 to 8, closed Sunday</hours>
<cost>Moderate</cost>
</restaurant>

Note that the DTD is null (SYSTEM ""), but the notation declaration connects the document with the architecture. Thus a human observer or parser *can* refer to the architecture if desired, but isn't required to.

I can use the restaurant architecture as part of a larger document type (say a document type for travel info). I can also specialize from it at the document level.
For example, I might have something like this:

<?XML 1.0?>
<!DOCTYPE CityGuide SYSTEM "" [
<?IS10744 ArcBase restaurant location>
<!NOTATION restaurant
   PUBLIC "-//Eliot Kimber//NOTATION Restaurant Architecture//EN">
<!NOTATION location
   PUBLIC "-//G. Flammia//NOTATION Location Architecture//EN">
]>
<cityguide>
<title>A Guide to Austin And Environs</title>
<places-to-eat>
<para>Austin is known for its barbeque, traditionally smoked over
hickory or mesquite and served dry or with spicy sauce</para>
<bbq-joint restaurant="restaurant">
<name>Kreiz' Barbeque</name>
...
</bbq-joint>
<nuevo-cuisine restaurant="restaurant">
<name>Coyote Cafe</name>
...
</nuevo-cuisine>
</places-to-eat>
<places-to-hear-music>
<bar location="location">
...
</bar>
</places-to-hear-music>
</cityguide>

Here I've done two things:

1. I've specialized from restaurant to further distinguish types of places to eat.

2. I've derived the document from two different architectures (restaurant and location).

>Can you point me to the relevant specs?

The AFDR Annex of HyTime can be found at "http://www.ornl.gov/sgml/wg8/hytime/html/clause-A.3.html" (in a few days--we're setting up the site now). The minimum you need to know in order to make the above work with SP can be found at "http://www.jclark.com".

The key difference between what I've shown above and the mechanism defined by the AFDR is the use of notation attributes to further configure the use of architectures in documents and meta-DTDs (architecture declaration sets). As XML doesn't [yet] have notation attributes, there's no way to use that aspect of the AFDR. However, you can approximate it as I've shown above.

Note that the "inheritance" is largely conceptual--this is data, not programming--so it's up to the authors of documents to understand the semantics of the class hierarchies and use them appropriately.
The declarations enable some validation against the architectures, but it's ultimately up to humans or down-stream processors to validate the use. Note also that the "methods" associated with elements *are* programs (browser objects, style sheet functions, transforms, etc), and so may do real inheritance. As mentioned before, it probably makes sense in general to design object-oriented processors that mirror the architecture classes.

If anyone wants to see how the above documents and architectures can be processed architecturally using SP, I'll work up the examples when I get a chance (after the holiday). I'm also preparing a more complete paper on similar uses of architectures which I'll announce once I've got it up on the ISOGEN Web site.

Cheers,

Eliot
--
<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
Highland Consulting, a division of ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004
www.isogen.com
</Address>

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)