Re: XML question - reference points
On Wed, 2004-04-07 at 00:46, Pascarella, Randy wrote:
<snip>
>
> The point is that the two files describe the same thing from the same
> schema in different ways via the values of the elements. So if I,
> being an intercessor, want to link the two together in a common way,
> how would I do this? I am currently thinking of something like this.
> Let's say I'm a government employee wanting to provide common tags for
> Bob and Tom to use in their XML files for consistency. I define a tag
> called <gov.illinois.chicago.street.name> as a "reference point". Bob

The problem with this approach is that you would have to define a new
element for every city in the U.S., or even the entire world if you want
to record an address elsewhere. Your schema (DTD, RELAX NG, or whatever
you use) would bloat enormously, so much that using it would not be
practical.

You can avoid this in a couple of ways. Here are two:

* Don't allow Tom and Bob to store inconsistent data in the files. If
  they _enter_ data the wrong way, your application could guide them,
  perhaps by verifying their entries against a database of street names.

* Use an attribute to store the correct name.

Instead of this:

> <city>
>   <name>Chicago</name>
>   <streets>
>     <street>
>       <name>Main Street</name>
>       <gov.illinois.chicago.street.name>Main St.</gov.illinois.chicago.street.name>
>       <zipcode>60609</zipcode>
>     </street>
>     <street>
>       <name>West Street</name>
>       <gov.illinois.chicago.street.name>West St.</gov.illinois.chicago.street.name>
>       <zipcode>60603</zipcode>
>     </street>
>     ...
>   </streets>
> </city>

You would get this:

<city>
  <name>Chicago</name>
  <streets>
    <street name="Main St.">
      <name>Main Street</name>
      <zipcode>60609</zipcode>
    </street>
    <street name="West St.">
      <name>West Street</name>
      <zipcode>60603</zipcode>
    </street>
    ...
  </streets>
</city>

Of course, you could also do this:

<city>
  <name>Chicago</name>
  <streets>
    <street>
      <entered-name>Main Street</entered-name>
      <real-name>Main St.</real-name>
      <zipcode>60609</zipcode>
    </street>
    <street>
      <entered-name>West Street</entered-name>
      <real-name>West St.</real-name>
      <zipcode>60603</zipcode>
    </street>
    ...
  </streets>
</city>

Please note that we know the real-name denotes a street in Chicago
because of the context. Providing the context information in an element
name is both redundant and impractical.

Both approaches above would give Tom and Bob the flexibility you want.
However, I would think twice, or thrice, before allowing it. Sooner or
later you will end up with:

<street name="Main St.">
  <name>Minor Street</name>
  <zipcode>60709</zipcode>
</street>

and now you will have a bit of a problem figuring out which street this
really is, especially if 60709 is an area where there is neither a Main
nor a Minor street. Allowing duplication of data is rarely a good thing.

>
> The whole point is that I want to give Tom and Bob the flexibility to
> use whatever values they want for the names, but have an "internal"
> way to map those elements into standardized elements that anyone can

What is the purpose of allowing this flexibility when entering what is,
in practice, data in a database?

Will Tom and Bob be the sole consumers of their own data? Apparently
not, or there would be no reason to store a normalized version of the
street names.

Are other consumers interested in Tom's and Bob's writing quirks?
Probably not. Are they interested in getting correct and unambiguous
information? Probably yes.

Building a strong case for allowing ambiguous data to be stored seems
hard to do. (In this particular case. Not necessarily always.)

You have to consider all of this before deciding whether to implement
the schema you have in mind.

>
> Does this make sense? Is there a better way to do this? Am I off in
> the weeds?

Well, I would probably go with trying to get Tom and Bob to enter
correct data in the first place. A system that helps them do this does
not have to be obnoxious or intrusive.

Of course, one could also ask: if Tom has gone through the trouble of
entering the data, why should Bob have to do the same thing? Wouldn't it
be better if Bob could just reuse Tom's entry, preferably by linking to
it?

/Henrik
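
P.S. A rough sketch of the first suggestion (verifying entries against a database of street names), in Python. Everything named here is made up for illustration: the CANONICAL_STREETS table stands in for a real street database, ABBREVIATIONS is a toy normalization rule, and check_streets is a hypothetical helper, not part of any existing tool. It assumes the plain <street><name/><zipcode/></street> layout from the examples above.

```python
import xml.etree.ElementTree as ET

# Hypothetical reference data: canonical street names per ZIP code.
# A real system would query a database; these values come from the
# examples in this thread.
CANONICAL_STREETS = {
    "60609": {"Main St."},
    "60603": {"West St."},
}

# Hypothetical abbreviation table used to normalize what a user typed.
ABBREVIATIONS = {"street": "St.", "avenue": "Ave."}


def normalize(entered):
    """Replace a trailing word such as 'Street' with its abbreviation."""
    words = entered.split()
    if words:
        words[-1] = ABBREVIATIONS.get(words[-1].lower(), words[-1])
    return " ".join(words)


def check_streets(xml_text):
    """Return (entered name, zipcode, problem) for every suspicious entry."""
    problems = []
    root = ET.fromstring(xml_text)
    for street in root.iter("street"):
        entered = street.findtext("name", default="").strip()
        zipcode = street.findtext("zipcode", default="").strip()
        known = CANONICAL_STREETS.get(zipcode)
        if known is None:
            problems.append((entered, zipcode, "unknown zipcode"))
        elif normalize(entered) not in known:
            problems.append((entered, zipcode, "no such street in this zipcode"))
    return problems


SAMPLE = """
<city>
  <name>Chicago</name>
  <streets>
    <street><name>Main Street</name><zipcode>60609</zipcode></street>
    <street><name>Minor Street</name><zipcode>60709</zipcode></street>
  </streets>
</city>
"""

if __name__ == "__main__":
    for entered, zipcode, problem in check_streets(SAMPLE):
        print(f"{entered!r} ({zipcode}): {problem}")
```

The idea is that the check runs when Tom or Bob enters the data, so the file only ever holds one name per street and no second "reference" element or attribute is needed.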