Re: 3 possible approaches for representing concepts
[Chiusano Joseph]

> I would like to please solicit some quick feedback if possible regarding
> an approach to using elements and/or attributes to represent concepts in
> an XML document. I am having a "healthy debate" with a "colleague" on
> how much "meaning" should be placed into an element name, and how much
> (if any) should be "filled out" by attributes.
>
> Below I've identified 3 approaches for representing a concept called
> (pipes separate the "subconcepts"):
>
> CurrentYear|Budget|Final|Estimated|Amount
>
> A "related" element in the same XML document might be called (note only
> the first subconcept has been changed):
>
> PriorYear|Budget|Final|Estimated|Amount
>

Joe,

I do not see enough information in your description to really get down to it, because I do not know what other uses you may have for some of these concepts. How separable do they have to be? How many similar structures will there be that differ only in minor ways (like prior year vs. current year)?

If you are asking about attributes vs. elements, I think that is not of much importance in itself. You can tell by asking whether you could transform one approach into the other (e.g., by using a stylesheet). If you can, they are basically isomorphic, and who cares (religious opinions aside!)? If you cannot, there is nothing to choose between, since you have to go with the form that works for your data.

In relational database practice, naming data elements properly can be of major importance, because most of the semantics tend to be conveyed by the names. It sounds like that is the case here too.

What is more important than elements vs. attributes is the reusability of processing, and the ability to know when two structures are essentially the same kind of thing. With this in mind, your example #3 (when the typos are fixed, of course) is probably closer to the mark, but maybe not there yet either.
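To make the "can you transform one into the other" test concrete, here is a minimal sketch in Python rather than a stylesheet. The `budget` element, its `year` and `status` attributes, and the `value` child name are all hypothetical, chosen only for illustration. If the round trip loses nothing, the two forms are isomorphic in the sense described above.

```python
import xml.etree.ElementTree as ET

def attrs_to_children(el):
    """Return a new element with each attribute of `el` turned into a child."""
    out = ET.Element(el.tag)
    for name, value in el.attrib.items():
        ET.SubElement(out, name).text = value
    # The text content goes into a child too; "value" is a made-up name.
    ET.SubElement(out, "value").text = el.text
    return out

def children_to_attrs(el):
    """Inverse of attrs_to_children: fold the children back into attributes."""
    out = ET.Element(el.tag)
    for child in el:
        if child.tag == "value":
            out.text = child.text
        else:
            out.set(child.tag, child.text)
    return out

# Hypothetical sample element, not taken from the original examples.
original = ET.fromstring("<budget year='current' status='final'>2999</budget>")
as_children = attrs_to_children(original)
round_trip = children_to_attrs(as_children)

# Nothing was lost in either direction, so the two forms carry the same data.
assert round_trip.attrib == original.attrib
assert round_trip.text == original.text
```

The sketch only covers flat elements with unique attribute names; anything repeated or nested would need a different mapping, or might not map at all.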
I think that the parent element names should be the same wherever possible, and you should use attributes for differentiating. The reason is that I find that doing special cases on element names is the most clumsy and least pleasant way to process XML data - and I have some applications where I do that. So I would rather see this:

    <yearlyBudget type='current-year' status='final'>2999</yearlyBudget>
    <yearlyBudget type='current-year' status='estimated'>2999</yearlyBudget>
    <yearlyBudget type='2002' status='estimated'>2999</yearlyBudget>

than this:

    <finalCurrentYearlyBudget>2999</finalCurrentYearlyBudget>
    <estimatedCurrentYearlyBudget>2999</estimatedCurrentYearlyBudget>
    <estimatedPreviousYearlyBudget>2999</estimatedPreviousYearlyBudget>

In the first case, it is easy to do the same processing for each element and stick the result into the right place, as determined by the attribute values. In the second case, you have to know that the same processing needs to be done on all three, AND you have to know the right place to put the results. That requires more design, more processing, and may lead to more maintenance problems later on.

However, if I would only ever have that one type of budget, I would prefer the first form. But I bet that you will have several types of budget and would want to reuse the structure and the processing. My first example is more reusable in that sense.

Also, if all your element structures are going to be flat with single data elements, so that you basically have a simple relational table, it makes much less difference. The principles I mention above are more significant when the XML structure is more complex, with a number of nested children, so that the processing is more complex.

However, if you know there will only be a smallish number of variations and that they will be relatively stable, using different element names would not be too bad, since it would be reasonably easy to look for them all.
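The difference in processing effort between the two forms can be sketched roughly as follows, in Python. The `<budgets>` wrapper element, the function names, and the "previous-year" label are assumptions made for illustration; only the `yearlyBudget` elements and the three long element names come from the examples above.

```python
import xml.etree.ElementTree as ET

# Form 1: one element name, variations carried in attribute values.
UNIFORM = """
<budgets>
  <yearlyBudget type='current-year' status='final'>2999</yearlyBudget>
  <yearlyBudget type='current-year' status='estimated'>2999</yearlyBudget>
  <yearlyBudget type='2002' status='estimated'>2999</yearlyBudget>
</budgets>
"""

# Form 2: variations baked into the element names.
NAMED = """
<budgets>
  <finalCurrentYearlyBudget>2999</finalCurrentYearlyBudget>
  <estimatedCurrentYearlyBudget>2999</estimatedCurrentYearlyBudget>
  <estimatedPreviousYearlyBudget>2999</estimatedPreviousYearlyBudget>
</budgets>
"""

def read_uniform(xml_text):
    """One loop handles every variation; the attributes say where it goes."""
    result = {}
    for el in ET.fromstring(xml_text).iter("yearlyBudget"):
        result[(el.get("type"), el.get("status"))] = int(el.text)
    return result

# Form 2 needs a hand-maintained table that maps each name to its meaning.
# Mapping "Previous" to "previous-year" is an illustrative assumption.
NAME_TABLE = {
    "finalCurrentYearlyBudget": ("current-year", "final"),
    "estimatedCurrentYearlyBudget": ("current-year", "estimated"),
    "estimatedPreviousYearlyBudget": ("previous-year", "estimated"),
}

def read_named(xml_text):
    """Special-case on element names; names missing from the table are skipped."""
    result = {}
    for el in ET.fromstring(xml_text):
        if el.tag in NAME_TABLE:
            result[NAME_TABLE[el.tag]] = int(el.text)
    return result

uniform_result = read_uniform(UNIFORM)
named_result = read_named(NAMED)
```

In this sketch, a new variation in the first form needs no code change, while in the second form it needs a new entry in `NAME_TABLE`, and a forgotten entry fails silently.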
Bottom line - prefer to use values rather than names to denote variations. Choose names based on:

1) The desired semantics and data model.

2) The expected number of similar variations for which you would want to do the same processing, including the number of changes and extensions that you are pretty sure will be coming.

Do not spend time choosing among essentially isomorphic alternatives, like elements vs. attributes. Since you cannot have repeated attributes in an element, a need for repeated data elements would be a non-isomorphic situation.

Cheers,

Tom P