RE: When to use attributes vs. elements
On Fri, 5 Feb 1999, Andrew Layman wrote:

> Thank you. Dan asks a reasonable question, which is whether a document that
> uses the conventions described in
> http://www.w3.org/TandS/QL/QL98/pp/microsoft-serializing.html needs to
> signal somehow that these conventions are in play.
>
> In case of the "canonical format" I proposed, however, I don't think special
> signalling is necessary: The proposal does not add any new interpretations
> to the use of elements or attributes beyond what can be described in a DTD
> or a schema such as XML-Data or DCD. Elements, attributes, ids and idrefs
> are carefully used so that their normal XML interpretation matches the
> scoping and linking rules of object graphs or relational databases.

So, to be clear on what you're claiming...

For any chunk of 'normal' XML, you have a set of interpretation rules that tell us how all the attributes and elements map into "graphs of data such as database tables and relations, nodes and edges from directed labeled graphs, and similar constructions"[1].

This would be enormously useful, if people could be persuaded it were true.

> In a general case, if conventions add rules for interpretation above what is
> in the structure of a document or above what can be expressed in a DTD, then
> this would need to be somehow signalled in order for a reader to process the
> document.

I'm a little confused in that [1] proposes a canonical framework for interpreting all XML as graph serialisations, but then goes on to discuss "Mapping Abbreviated Syntax to Canonical Syntax":

    However, the canonical syntax is not the only syntax that could be used to serialize a graph. In many cases, alternative syntaxes may be used, either due to historical or political factors, or to take advantage of compressions that are available if one has domain knowledge.
    We call all of these "abbreviated syntaxes."[1]

This implies that some unknown subset of XML instance data will have been serialised according to one or more alternate serialisation algorithms. Consequently, de-serialising such data according to the 'canonical' algorithm will garble your data. In which case we're back in a situation where we need a mechanism such as <XYZ:SerializationAccordingToAndrew> to tell us which data can be interpreted according to the 'canonical' rules versus some alternate (possibly unknown) serialisation rules.

The example alternate serialisation given is:

    <Class>
      <name>Western Civilization</name>
      <taughtBy>Thorsten</taughtBy>
      <attendedBy>Raphael</attendedBy>
      <attendedBy>Smith</attendedBy>
    </Class>

Interpreting this according to the "Procedure for XML Instance to Graph Conversion" rule will give garbage data. We simply don't know from looking at the XML above what nodes and edges it creates. The fact that we need to treat such data in a special manner is worrying: how are we supposed to _know_ when there is something else to know?

(repeated from above)
> In a general case, if conventions add rules for interpretation above what is
> in the structure of a document or above what can be expressed in a DTD, then
> this would need to be somehow signalled in order for a reader to process the
> document.

This suggests that the burden is placed upon content creators to flag up when the generic 'canonical' rule wouldn't usefully apply to the interpretation of the XML content. So the default behaviour would be to assume everyone used the rules outlined in [1] unless associated schema, stylesheet or enclosing tags told us otherwise?

So...
if I'm a 'canonical-format' aware processor building a graph from XML data acquired from a variety of sources, what procedure do I follow to sort XML instance data into the following categories:

 a) old XML files which *happen* to have been serialised according to the canonical-format rules
 b) old XML files which happen *not* to have been serialised according to the canonical-format rules (for example, the extract above)
 c) recent XML files created by following the c-f rules for serialising graphs
 d) recent XML files created using an alternative or abbreviated graph serialisation algorithm as discussed in [1]

In particular, I'm concerned that (a) and (b) are mechanically indistinguishable.

Dan

[1] http://www.w3.org/TandS/QL/QL98/pp/microsoft-serializing.html

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
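
To illustrate the concern in the message above that categories (a) and (b) cannot be told apart mechanically, here is a minimal Python sketch. It parses the <Class> extract quoted in the message and builds edges under two different readings. Both readings, and the function names read_as_literals and read_as_references, are hypothetical and invented for this illustration; neither is the conversion procedure defined in [1]. The only point is that one well-formed document can yield two different graphs, and nothing in the document says which was intended.

```python
import xml.etree.ElementTree as ET

# The extract quoted in the message, unchanged apart from indentation.
XML = """<Class>
  <name>Western Civilization</name>
  <taughtBy>Thorsten</taughtBy>
  <attendedBy>Raphael</attendedBy>
  <attendedBy>Smith</attendedBy>
</Class>"""


def read_as_literals(root):
    """Hypothetical reading A: every child element is a labeled edge
    from the parent node to a literal string value."""
    return {
        (root.tag, child.tag, ("literal", child.text.strip()))
        for child in root
    }


def read_as_references(root):
    """Hypothetical reading B: 'name' is a literal, but every other
    child element names another node in the graph (a person)."""
    edges = set()
    for child in root:
        kind = "literal" if child.tag == "name" else "node"
        edges.add((root.tag, child.tag, (kind, child.text.strip())))
    return edges


root = ET.fromstring(XML)
graph_a = read_as_literals(root)
graph_b = read_as_references(root)

# Same input, two readings: the edge sets differ.
print(graph_a == graph_b)
for edge in sorted(graph_a ^ graph_b):
    print(edge)
```

Choosing between the two readings needs information from outside the instance (a schema, a namespace, or an enclosing marker element), which is the signalling question raised in the thread.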