The Power of Groves
Strictly speaking, I suppose that groves are somewhat off-topic for this group, but since there is already a lot of discussion about them here, and since the small group of people who have a lot of understanding of and experience with groves seems to be well represented here, I'll give it a shot:

I was rereading some old material on groves, and came across the following in a post by Eliot Kimber to comp.text.sgml (it was at the end of a paragraph discussing the definition of customized property sets for various kinds of data; the full context is available at http://www.oasis-open.org/cover/grovesKimber1.html):

"However, there is no guarantee that the property set and grove mechanism is capable of expressing all aspects of any notation other than SGML."

(Notes 440 and 442 in section A.4 of the HyTime spec say much the same thing.)

On the face of it, this is a perfectly sensible thing to say. At the same time, however, it is rather disturbing, because it suggests that there might exist data sets for which the grove paradigm is wholly unsuited. I would certainly hate to expend a lot of effort building a grove-based data model for a data set, only to discover part way through that groves and property sets simply won't work for that data set.

In the world of computing, we can rest easy knowing that there exists, at least conceptually, a Universal Turing Machine, and that such a machine, given an appropriate program, is capable of computing anything that is computable. So the first question is this:

1) Does a Universal Data Abstraction exist?

Note that, like a Universal Turing Machine, such an abstraction need not be particularly efficient or otherwise well suited to any specific task. The only requirement is that it be universal in the sense of being capable of representing any conceivable data set (or at least any "reasonable" data set).
(And no, I don't have a formal definition of what "reasonable" would mean in this context; all I can say is that the definition itself should be reasonable....)

The real importance of a Universal Data Abstraction is that it would provide a formal basis for the construction of one or more Practical Data Abstractions.

Assuming that the answer is "yes" (and I have no real justification other than optimism to believe that it is), the second question follows immediately:

2) Does the grove paradigm, or something similar to the grove paradigm, constitute a Universal Data Abstraction?

If one is feeling contrary, it would be easy to answer "no" to the second question by providing an example that answers the third question in the affirmative:

3) Does there exist any "reasonable" data set for which the grove paradigm inherently cannot provide an adequate representation?

When attempting to answer this third question, it is important to avoid getting caught up in unwarranted topological arguments. The topology of groves may not map onto the topology of a particular data set, but that does not mean that that data set is unrepresentable as a grove. Consider XML: An XML document consists of a linear, ordered list of Unicode characters, yet the XML format is quite capable of representing any arbitrary directed acyclic graph.

========

On a somewhat related note, I've noticed that in discussions regarding the Power of Groves, the arguments by the proponents seem to fall into two distinct groups. On the one hand, some people see groves as being quite universal in their applicability. On the other, some people talk about groves almost exclusively within the context of SGML, DSSSL and/or HyTime. As an outsider and relative latecomer to the party, I find it difficult to determine whether this dichotomy of viewpoints is real, or merely reflects the differences in the contexts in which the discussions have taken place.
If the schism _is_ real, it would be helpful if those sitting on either side of the fence could add their thoughts regarding why the schism is there, and why the people on the other side are wrong. :)

An example of why I am concerned by this question is given by the property set definition requirements in section A.4 of HyTime. The definition of property sets is given explicitly in terms of SGML. That is, a property set definition _is_ an SGML document. But it seems to me that if property sets have any sort of widespread applicability outside of SGML, then a property set definition in UML or IDL or some other notation would serve just as well (assuming that those other notations are sufficiently expressive; I'm fairly confident that UML is, but I'm not so sure about IDL).

Of course, it can be argued that _some_ notation had to be used, so why not SGML? My response to that is that I believe that the mathematical approach of starting with a few extremely basic axioms and building on those as required to develop a relevant "language" for expressing a model would be far superior, as it would allow people to fully visualize the construction of the property set data model (or "metamodel," if you prefer), without getting bogged down in arcane SGML jargon. After all, SGML can hardly be described as minimalist.

(An aside: I believe that a lot of the resistance to acceptance of SGML and HyTime has its basis in the limitation of identifiers to eight characters, leading to such incomprehensible abominations as "rflocspn" and "nmndlist." Learning a completely new body of ideas is hard enough without having to simultaneously learn a foreign--not to mention utterly unpronounceable--language.)

This situation with property set definitions reminds me of the recent discussions in this group regarding the chicken-and-egg relationship between the XML notation and the XML data model.
The absence of a pre-existing data model for XML leads to a scenario in which everyone who uses XML builds their own mutually-slightly-incompatible data models. While I can't prove that the same has happened or will happen with property set definitions and other related aspects of the grove paradigm, I think such a thing is certainly plausible. As with XML, the question boils down to what is more fundamental, the notation or the data model.

-Steve Schafer