Thomas B. Passin wrote:

> Several. Remember that in the (relational, at least) world, we have logical
> and physical (and maybe conceptual too) data models.
> ...
> What plays an analogous role in xml data model approaches? In xml data
> modeling, people tend to dive right in with instances and physical schemas.
> Maybe this isn't the best approach.

We had the same distinction between logical and physical design in the XML world, and certainly in the SGML world: element type content models (and, to some extent, attributes) described logical structure, and entities (in the XML sense of the word: a named unit of storage defined in a DTD by an entity declaration) described physical structure. For a large, complex document, you didn't keep everything in one big file; you broke it into pieces based on which pieces had to be shared with other documents, which did or didn't need to be updated at a given frequency, and, of course, on operating system and processing program efficiency considerations.

It seems that people don't do this as much anymore, for two reasons. First, when more information was distributed via CD-ROM, we were dealing with bigger files, so the best way to break them down was a bigger issue. Now that we're dealing with files sent over the Internet, typically much smaller than a megabyte, it's not as necessary. Second, a lot of people just didn't like entity declarations. One of the complaints about DTDs that inspired some of the schema proposals was that DTDs defined logical and physical structure in the same place, which is not the cleanest way to describe a complex system. Unfortunately, the most common solution was simply to ignore physical design issues when giving developers a way to design their document type; fortunately, XInclude is now in Last Call status, so these physical issues can still be addressed from a schema-based system.
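To make that concrete, here is a sketch of the entity-based physical breakdown, followed by its XInclude equivalent. The filenames and the "book" document type are hypothetical, and the declarations are abbreviated:

```xml
<!-- book.xml: content models describe logical structure; the
     entity declarations describe physical structure. -->
<!DOCTYPE book [
  <!ELEMENT book (title, chapter+)>
  <!ELEMENT title (#PCDATA)>
  <!-- (chapter declarations abbreviated) -->
  <!-- Each chapter lives in its own file, so it can be shared with
       other documents or updated on its own schedule. -->
  <!ENTITY chap1 SYSTEM "chap1.xml">
  <!ENTITY chap2 SYSTEM "chap2.xml">
]>
<book>
  <title>Sample Book</title>
  &chap1;
  &chap2;
</book>

<!-- The XInclude version keeps the same physical breakdown without
     any entity declarations, so the logical design can live entirely
     in a schema: -->
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Sample Book</title>
  <xi:include href="chap1.xml"/>
  <xi:include href="chap2.xml"/>
</book>
```

Either way, a processor sees one logical book; only the storage layout differs.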
> For relational databases, we have various degrees of normalization and we
> know that the logical data model should be in at least 3rd normal form.
> That's another metric. Normal forms are about making different data items
> and structures orthogonal to each other and about reducing redundancy. It
> would be interesting and valuable to look at xml data structures to find out
> how to achieve comparable goals.

This all works well for information that fits into tables, but for XML it usually works only for data that started off in tables to begin with (e.g., FpML data). For information that doesn't, it's a problem: if you did it with DocBook, you'd need a separate table for all your emphasized words and phrases. Because there is no straightforward series of normalization steps, DTD and schema design has been considered a black art, much like OO design. In fact, the tools of OO design have helped out here; Addison-Wesley's series of UML books has a new one titled "Modeling XML Applications with UML: Practical e-Business Applications."

> All these things (and more, including coherence) go into making a good ER
> data model. They are all independent of the processing algorithms.

Mr. ER himself, Dr. Peter Chen, is actually on the XML Schema Working Group and has been giving a talk at the last few XML DevCons about ER's affinity for XML. The main thrust of his talk addressed the value of the "R" in "ER modeling" and how ER modeling provides a good formal basis for designing information structures using RDF or XLink, where modeling of relationships is so important.

Bob DuCharme          www.snee.com/bob          <bob@snee.com>

See http://www.snee.com/bob/xsltquickly for info on the upcoming
"XSLT Quickly" from Manning Publications.
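P.S. To make the DocBook point above concrete, here is a hypothetical DocBook-style fragment. The emphasized phrases are ordered spans inside running text, not repeating fields with a natural key, which is why there is no obvious normalization step to apply:

```xml
<!-- Mixed content: emphasis is interleaved with character data
     at arbitrary positions within the paragraph. -->
<para>Entities describe <emphasis>physical</emphasis> structure;
content models describe <emphasis>logical</emphasis> structure.</para>

<!-- A fully normalized relational version would need something like
     a table of spans keyed by paragraph and character offset, e.g.
       emphasis(para_id, start_offset, end_offset)
     which is possible but obscures the design instead of
     clarifying it. -->
```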