Re: XML basics
On Tue, 1 Mar 2011 08:07:49 +0000, Joe Fawcett wrote:
> Thanks for your comments, can you suggest a good term for a generic building
> block of XML if I can't use the term 'node'?

There really isn't one that's generally agreed upon. That's true for 'node', btw. Once you've created a tree, there's relatively little argument that an element is a 'node', but, just for instance, in the XQuery Data Model, it's possible to treat namespaces as not-nodes (detail isn't all that relevant here, I think).

Is text content a node? How about ignorable whitespace? Comments, processing instructions? Entity references? The XML declaration? The internal subset? What about the contents of the internal subset (which aren't quite XML, but do have those familiar pointy brackets)? SAX has a characters() event--how many of those make up a node? In the DOM, there are namespace attributes (which are nodes); other APIs treat namespaces and attributes as disjoint sets (and the XDM permits namespace bindings to be treated as something approximately like metadata on the tree, with no nodes available to navigate to). Is a document a node? What's a document, then? Is an external parsed entity a node?

DOM has 12 'node' types; the infoset has 11 information items; XDM has 7 node kinds (this from memory, so I might have fudged a number or two, but the point should remain: the degree of variance indicates a rather slippery term, which means that it's up to you to define what you mean by it).

For a book on XML basics, you might reasonably say that a common programmatic representation of XML syntax in memory is as a tree of nodes, but unless you want to descend into the swamp (keep in mind that the XML Infoset spec came along after DOM and SAX and XPath and attempted to unify these three very different models, along with other inputs), it might be best to then innocently mention that which syntactic constructs count as a node is not well-defined. If you do achieve a definition ... what will it be?
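To make the slipperiness concrete, here is a rough sketch (not anything from the thread) that feeds one small document to three APIs from the Python standard library and asks each what it considers a "node". The sample document and the names SAMPLE, dom_view, etree_view, sax_view and CharCounter are invented for illustration. The behaviour shown is that of xml.dom.minidom, xml.etree.ElementTree and xml.sax as I understand them; other DOM or SAX implementations may well differ.

```python
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET
import xml.sax

# Hypothetical sample: a comment, some text, an element with one attribute.
SAMPLE = '<root><!-- note -->text<child a="1"/>tail</root>'


def dom_view(xml_text):
    """DOM: comments and text are child nodes; attributes are nodes too,
    but they are not children of the element that owns them."""
    doc = minidom.parseString(xml_text)
    root = doc.documentElement
    kinds = [n.nodeType for n in root.childNodes]
    attr = root.getElementsByTagName("child")[0].getAttributeNode("a")
    attr_is_node = attr.nodeType == minidom.Node.ATTRIBUTE_NODE
    return kinds, attr_is_node, attr.parentNode


def etree_view(xml_text):
    """ElementTree: only elements are tree members. Text is a plain string
    hanging off an element, and the default parser drops the comment."""
    root = ET.fromstring(xml_text)
    return len(root), root[0].tail


class CharCounter(xml.sax.ContentHandler):
    """Counts characters() events; the parser decides how to chunk text."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.chunks = []

    def characters(self, content):
        self.calls += 1
        self.chunks.append(content)


def sax_view(xml_text):
    """SAX: no nodes at all, just events. The comment is not reported to a
    ContentHandler, and the number of characters() calls is not guaranteed."""
    handler = CharCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.calls, "".join(handler.chunks)


if __name__ == "__main__":
    print("DOM:        ", dom_view(SAMPLE))
    print("ElementTree:", etree_view(SAMPLE))
    print("SAX:        ", sax_view(SAMPLE))
```

Same bytes in, three different answers: the DOM reports four children under the root plus an attribute node with no parent, ElementTree reports one child and a string, and SAX reports only that some characters went by.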
'Node' in common usage indicates participation in a graph--nodes and edges, nodes and connections. But (according to some very popular APIs) there are nodes that are not the children of their parents. There are also nodes that are not visible in the syntax (if you accept that namespaces define nodes this is easy to show: xml:lang="en-US" with no xmlns:xml declaration).

The preferred programmatic and algorithmic representations of XML vary both by usage and by the predilections of API designers, and a number of terms (notably including 'node') are overloaded.

The *syntax* is core; it's well-defined by a fairly terse collection of BNF in the base specification (which is usually amended by including the namespaces spec, to our sorrow, as I have come to think). How that information is defined for programmatic examination and manipulation varies pretty widely, even among the W3C-produced specifications for XML, and even keeping to a limited set of implementation languages. An event-oriented API (SAX or StAX in Java, for instance) is a reasonable next step. You probably don't want to ignore tree models in a book on basics, but ... arm yourself, for there be dragons.

Amy!
-- 
Amelia A. Lewis    amyzing {at} talsever.com
The less I seek my source for some definitive, the closer I am to fine.
    -- Indigo Girls