Re: Designing XML to Support Information Evolution
Welcome back Roger,

Ah, a posting about vineyards...just in time for my vacation in Naples,
Italy next week. ;)

This makes a lot of sense to me - please see comments below (far down).

Kind Regards,
Joe Chiusano
Booz | Allen | Hamilton
Strategy and Technology Consultants to the World

> "Roger L. Costello" wrote:
>
> Hi Folks,
>
> For the past 4 months I have been working on a demo of a Vineyard in
> which Pickers move around, harvest ripe grapes, eat, and even die. In
> the process of building this demo I have learned some things regarding
> XML design, which I would like to share.
>
> When I first started working on the demo I thought that the best way
> to design the XML for the Vineyard/Picker system was a "classical"
> highly structured, hierarchical design. In fact, this turned out to
> be the worst approach. It was rigid, it made processing the
> information (e.g., moving the Pickers around, harvesting ripe grapes,
> eating, death) horribly complex, and I wanted to be able to process
> the Vineyard lots in a parallel fashion, which this design totally
> prohibited. Here was my first design:
>
> <vineyard>
>     <tract num="1">
>         <lot num="1">
>             ...
>         </lot>
>         <lot num="2">
>             ...
>             <picker id="36"> <!-- Picker #36 on lot #2, tract #1 -->
>                 ...
>             </picker>
>         </lot>
>         ...
>         <lot num="50">...</lot>
>     </tract>
>     ...
>     <tract num="50">...</tract>
> </vineyard>
>
> As you can see, this design is classical structured data:
>    - the vineyard is comprised of multiple tracts
>    - each tract is comprised of multiple lots
>    - a lot may contain a picker
>
> Several thousand lines of XSLT code later I decided it was time to
> dump this design.
>
> My next design "flattened" things out a bit. I put the Pickers
> physically after the tracts, and each Picker referenced the tract/lot
> that they resided upon using a couple of "ref attributes". This made
> "moving" the Pickers easy - simply adjust the references.
> Here was my second design:
>
> <vineyard>
>     <tract num="1">...</tract>
>     <tract num="2">...</tract>
>     ...
>     <tract num="50">...</tract>
>     <picker id="1">
>         <location tract-ref="13" lot-ref="48"/>
>         ...
>     </picker>
>     ...
>     <picker id="400">
>         <location tract-ref="21" lot-ref="4"/>
>         ...
>     </picker>
> </vineyard>
>
> With this design my XSLT code dropped from several thousand lines to
> about a thousand lines. However, this design was still too rigid, and
> made parallel processing of the lots impossible.
>
> Here is the design that I finally arrived at. It is extremely
> flexible, amenable to parallel processing, and the code to manipulate
> it is very simple (a couple hundred lines of simple XSLT code).
>
> <vineyard>
>     <lot tract-num="23" lot-num="5">...</lot>
>     <picker id="36">
>         <location tract-ref="12" lot-ref="29"/>
>         ...
>     </picker>
>     <lot tract-num="3" lot-num="24">...</lot>
>     ...
>     <lot tract-num="1" lot-num="49">...</lot>
> </vineyard>
>
> The lots have 2 attributes to identify their location.
> Each picker has a location element that has 2 attributes to identify
> the lot it resides on.
>
> Notice that it is an extremely flat structure:
>    - a vineyard is comprised of lots and pickers (no more <tract>
>      elements)

Of course, this works in this case because the information for tracts
is nothing more than a number. If there were additional properties for
tracts, then you would need a separate set of <tract> elements with
subelements representing their properties.

> Notice that it is an extremely flexible structure:
>    - the order of the lots and pickers is irrelevant
>
> With this design I can now process each lot on the vineyard in
> parallel. The other designs forced a sequential processing.
>
> Here are some lessons I learned.
> I believe these lessons apply to all XML information structures
> where you have a requirement to evolve the information structure by
> moving the information (e.g., move the Picker around to different
> lots), changing the information values (e.g., a Picker harvests ripe
> grapes, thereby decreasing the value of <ripe-grapes> on a lot), and
> where parallel processing of the information is desired/needed. I
> don't know if these lessons apply everywhere.
>
> 1. How you structure your information in XML has a tremendous impact
> on the processing of the information.
>
> 2. Hierarchy makes processing information hard! There exists a
> relationship between hierarchy of information and the complexity of
> code to process the information. The relationship is roughly: the
> greater the hierarchy, the greater the complexity of code to process
> the information. (Some hierarchy is good, of course. But the amount
> of hierarchy that is good is probably much less than one might
> imagine, certainly less than I thought, as described above.)
>
> 3. Flat data is good data! Flatten out the hierarchy of your data.
> It makes the information flexible and easier to process.
>
> 4. Order hurts! Requiring a strict order of the information makes for
> a brittle design. It is only when I allowed the lots and pickers to
> occur in any order that the flexibility and simplicity kicked in.

Not to start up a permathread, but of course sometimes order is
necessary.

Kind Regards,
Joe Chiusano
Booz | Allen | Hamilton
Strategy and Technology Consultants to the World

> Comments? /Roger

--
Kind Regards,
Joseph Chiusano
Associate
Booz | Allen | Hamilton
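
For readers who want to try the flat design, here is a minimal sketch in Python (not the XSLT that Roger wrote, which was not posted). It assumes each <lot> carries a <ripe-grapes> child holding an integer; the sample values and the helper names index_lots, lot_of, move_picker and harvest are made up for illustration. It shows that moving a picker amounts to rewriting two reference attributes, and that lots can be looked up regardless of where they appear in the document.

```python
import xml.etree.ElementTree as ET

# Sample document in the flat layout. The <ripe-grapes> values are
# invented for this sketch; the original post elides lot contents.
FLAT_VINEYARD = """
<vineyard>
    <lot tract-num="23" lot-num="5"><ripe-grapes>4</ripe-grapes></lot>
    <picker id="36">
        <location tract-ref="12" lot-ref="29"/>
    </picker>
    <lot tract-num="12" lot-num="29"><ripe-grapes>7</ripe-grapes></lot>
</vineyard>
"""


def index_lots(vineyard):
    """Map (tract, lot) -> <lot> element; document order is irrelevant."""
    return {
        (lot.get("tract-num"), lot.get("lot-num")): lot
        for lot in vineyard.findall("lot")
    }


def lot_of(picker, lots):
    """Resolve a picker's <location> references to the lot it stands on."""
    loc = picker.find("location")
    return lots[(loc.get("tract-ref"), loc.get("lot-ref"))]


def move_picker(picker, tract, lot):
    """Moving a picker only rewrites its two reference attributes."""
    loc = picker.find("location")
    loc.set("tract-ref", str(tract))
    loc.set("lot-ref", str(lot))


def harvest(picker, lots, amount=1):
    """Take up to `amount` ripe grapes from the picker's current lot."""
    grapes = lot_of(picker, lots).find("ripe-grapes")
    taken = min(amount, int(grapes.text))
    grapes.text = str(int(grapes.text) - taken)
    return taken


vineyard = ET.fromstring(FLAT_VINEYARD)
lots = index_lots(vineyard)
picker = vineyard.find("picker[@id='36']")

harvest(picker, lots, 3)      # picker starts on tract 12, lot 29
move_picker(picker, 23, 5)    # no elements are relocated, only refs change
harvest(picker, lots, 10)     # tract 23, lot 5 holds fewer than 10 grapes
```

Because every lot is a direct child of <vineyard> and is found by its attribute pair, nothing in the sketch depends on the order of lots and pickers, which is the property lesson 4 describes.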