The Myth of Explicit Relationships [Was: 3 XML Design Principles]
Hi Folks,

This past week we discussed implicit versus explicit relationships. This morning I carefully reread every message. Several people commented that "XML Doesn't Care" and "XML is just syntax". I'd like to apply those comments to the notion of explicit relationships.

In this message I would like to reverse my stand and argue that it is not possible for XML markup to state explicit relationships. I will argue that explicit relationships in markup are a myth.

Consider this markup:

    <Lot id="1">
        <Picker id="John"> ... </Picker>
    </Lot>

What's the relationship between the Lot and the Picker?

The tag names have meaning to us as humans. Thus, it is tempting for us to read the markup and infer a relationship that in fact does not exist.

To an XML parser the markup looks like this:

    <W#*jQ10x>
        <p99@% I8s="John"> ... </p99@%>
    </W#*jQ10x>

That is, to an XML parser the tags are just a collection of meaningless characters.

The advantage of us humans viewing the markup in this latter form is that it lessens our overwhelming temptation to infer meaning. Oddly, the relationship becomes clearer! We now see that the relationship is simply:

    p99@% is nested within W#*jQ10x

Or:

    Picker is nested within Lot

No other statements can be made regarding the relationship between the Lot and the Picker. Thus, it is erroneous to state this relationship:

    The Picker is located on the Lot

This "located on" relationship may not be stated. It brings in knowledge that is not present in the markup. (The knowledge comes from the mind of the reader.)

In fact, even the "nested within" relationship is only known if the processing application happens to be an XML-aware application. A non-XML-aware application would not even be able to recognize the "nested within" relationship.

I make these two assumptions for this discussion:

1. The processing application is an XML-aware application.
2. The processing application is completely ignorant of our vocabulary.

Let us continue with our example.
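The "nested within" observation above can be checked with a generic parser. What follows is only a minimal sketch under stated assumptions, not a definitive implementation: it uses Python's standard xml.etree.ElementTree, and the opaque names W_jQ10x, p99 and I8s are hypothetical stand-ins, since the scrambled names shown above contain characters that XML names may not contain. The helper names nesting_pairs and shape are also made up for illustration.

```python
import xml.etree.ElementTree as ET

# The readable markup from the message (the "..." content is omitted).
READABLE = '<Lot id="1"><Picker id="John"/></Lot>'

# Assumption: W_jQ10x, p99 and I8s are hypothetical stand-ins. The scrambled
# names in the message use characters (#, *, @, %) that are not legal in XML
# names, so a conforming parser would reject them.
OPAQUE = '<W_jQ10x><p99 I8s="John"/></W_jQ10x>'


def nesting_pairs(xml_text):
    """Return (child_tag, parent_tag) pairs, i.e. the "nested within" facts."""
    root = ET.fromstring(xml_text)
    return [(child.tag, parent.tag) for parent in root.iter() for child in parent]


def shape(xml_text):
    """Return the nesting depth of each element in document order, names erased."""
    def walk(elem, depth):
        yield depth
        for child in elem:
            yield from walk(child, depth + 1)
    return list(walk(ET.fromstring(xml_text), 0))


if __name__ == "__main__":
    for text in (READABLE, OPAQUE):
        for child, parent in nesting_pairs(text):
            print(f"{child} is nested within {parent}")
```

Under assumption 2 (an application ignorant of our vocabulary), the two documents differ only in spelling: shape() returns the same list for both, and nothing in either result says "located on".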
Suppose that we decide that the "nested within" relationship is not sufficiently precise for our desires. Can we design the markup to make the relationship more precise? How about this:

    <Lot id="1">
        <locatedOn>
            <Picker id="John"> ... </Picker>
        </locatedOn>
    </Lot>

It would appear that the <locatedOn> element is making explicit the relationship between the Lot and the Picker.

It is best to resist the temptation of our mind to add knowledge to the markup. So, let us convert the example into something less interpretable by our mind (but equally interpretable to a machine):

    <W#*jQ10x>
        <vb*@34>
            <p99@% I8s="John"> ... </p99@%>
        </vb*@34>
    </W#*jQ10x>

This makes it clear that the only relationships which can be stated are:

    p99@% is nested within vb*@34, which is nested within W#*jQ10x

or:

    Picker is nested within locatedOn, which is nested within Lot

Introducing the locatedOn element has done nothing to make the relationship between the Lot and the Picker more explicit. It has only served to push the Picker to a deeper nesting level (and thus more digging is needed to get at it).

Any relationship information beyond "nested within" must be introduced outside the markup, i.e., by human-engineered applications that process the markup.

Conclusions/Observations/Questions

1. No matter how you design your XML it won't make any difference in semantics. Specifically, nested elements will always yield the "nested within" relationship and nothing else.

2. There is no such thing as markup that enables you to specify an "explicit relationship" between components. Any relationship semantics beyond "nested within" is entirely a product of the application processing the markup.

3. XML places the whole burden of semantics squarely upon the shoulders of processing applications. Consequently, the sender and receiver must necessarily be tightly coupled (i.e., have shared semantics).

4. All message exchanges are nothing more than a series of encodings and decodings.
For example, this message:

    "The Picker whose name is John, is currently located on Lot number 1"

may be encoded like this:

    <Lot id="1">
        <Picker id="John"> ... </Picker>
    </Lot>

Or it may be encoded like this:

    <W#*jQ10x>
        <p99@% I8s="John"> ... </p99@%>
    </W#*jQ10x>

Either is a perfectly fine encoding. Communication occurs when the receiver possesses the proper decoder. Thus, a decoder must be able to decode the above code back into the original message:

    "The Picker whose name is John, is currently located on Lot number 1"

If the receiver does not have the proper decoder then communication cannot occur.

5. Is it conceivable that a code could be processed by many different decoders?

6. Are there such things as "self-decoding codes"? (I guess that a virus is an example.)

7. Is it reasonable that a generic message format (i.e., XML) should carry at least some of the burden of semantics? Should XML 2.0 possess more semantics than is currently found in XML 1.0?

Comments?

/Roger
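The encoding/decoding view in item 4 can be sketched in a few lines. This is only an illustration under stated assumptions: the decoder tables and the decode function are hypothetical, the opaque names W_jQ10x, p99 and I8s are legal stand-ins for the scrambled names above, and an I8s attribute is added to the opaque outer element so that the Lot number can be recovered at all.

```python
import xml.etree.ElementTree as ET

MESSAGE = "The Picker whose name is John, is currently located on Lot number 1"

READABLE = '<Lot id="1"><Picker id="John"/></Lot>'
# Assumption: legal stand-in names, plus an added I8s attribute on the outer
# element so that the Lot number is present in this encoding too.
OPAQUE = '<W_jQ10x I8s="1"><p99 I8s="John"/></W_jQ10x>'

# Hypothetical decoder tables: each maps one encoding's names to the concepts
# that sender and receiver agreed on outside the markup.
READABLE_DECODER = {"lot": "Lot", "picker": "Picker", "id": "id"}
OPAQUE_DECODER = {"lot": "W_jQ10x", "picker": "p99", "id": "I8s"}


def decode(xml_text, decoder):
    """Turn one encoding back into the original message, or fail."""
    root = ET.fromstring(xml_text)
    if root.tag != decoder["lot"]:
        raise ValueError("this decoder does not match the encoding")
    picker = root.find(decoder["picker"])
    if picker is None:
        raise ValueError("this decoder does not match the encoding")
    name = picker.get(decoder["id"])
    number = root.get(decoder["id"])
    # The "located on" meaning is supplied here, by the application.
    return f"The Picker whose name is {name}, is currently located on Lot number {number}"


if __name__ == "__main__":
    print(decode(READABLE, READABLE_DECODER))
    print(decode(OPAQUE, OPAQUE_DECODER))
```

With the matching table, either encoding decodes to the same sentence. With the wrong table, decode raises ValueError, which is the "communication cannot occur" case.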