DOM (Document Object Model)
Stylus Studio® XML Enterprise Suite handles the details of DOM and SAX for you when you parse XML and connect stages in an XML pipeline. But sometimes it's useful to know how all the pieces of DOM fit together. This page gives a brief overview of the Document Object Model, or DOM, along with comparisons of the three levels of DOM available.
One of the problems with DOM is that it was designed to be a cross-platform
interface, which basically means the way it has of expressing the XML model
is not optimal for any single platform. Thus instead of any local language idiom
— such as collections, hashes, sets or maps —
there are artificial constructs such as
Note that this document is only concerned with the XML DOM, and not the HTML DOM. However, many of the same comments apply to the latter. Also, this document deals only with W3C recommendations; drafts and even candidate recommendations are not included.
DOM Basic Types
The DOMObject type represents a native object (or Object) type, as in ECMAScript or Java.
The DOMString type is implemented as a sequence of 16-bit units encoded in UTF-16. It corresponds to the String type in both Java and ECMAScript, or (unofficially) the wstring type in some C/C++ implementations.
DOMTimeStamp (2) (3)
The DOMTimeStamp is used to store either a time or a duration, measured in milliseconds. It must be implemented either as a Date type or as at least a 64-bit integer.
This is a reference to user-specific data. In Java, it is an Object, and in ECMAScript, it is an any.
DOM Structure Model
Attr objects correspond to the attributes of an element. However, they are not part of the DOM tree, and therefore have no parentNode, previousSibling or nextSibling. An element can access them by means of specific get, set and remove methods. Oddly enough, they are extensions of Node.
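As a rough illustration of the get, set and remove methods mentioned above, here is a minimal sketch using Python's standard xml.dom.minidom module. The choice of Python and the sample document (a book element with id and lang attributes) are assumptions made for this example only.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document; element and attribute names are invented.
doc = parseString('<book id="42" lang="en"/>')
book = doc.documentElement

# Attributes are read, written and removed through the owning element.
book_id = book.getAttribute("id")
book.setAttribute("lang", "fr")
book.removeAttribute("id")

# The Attr object is itself a Node, but it sits outside the tree,
# so it has no parent and no siblings.
lang_attr = book.getAttributeNode("lang")
print(lang_attr.value, lang_attr.parentNode, lang_attr.nextSibling)
```

The owning element is still reachable from the attribute through ownerElement, which is how code typically gets "back" to the tree from an Attr.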
CDataSection (1) (2) (3)
This represents CDATA sections in the source XML.
CharacterData (1) (2) (3)
This represents the content of an XML comment. It does not include the actual <!-- and --> markers, but only what is between them. It is also a subclass of CharacterData.
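A small sketch of what a comment node holds, again assuming Python's xml.dom.minidom as the DOM implementation; the sample document is made up.

```python
from xml.dom.minidom import CharacterData, parseString

# Hypothetical sample document containing a single comment.
doc = parseString("<note><!-- draft only --></note>")
comment = doc.documentElement.firstChild

# Only the text between the markers is stored in the node.
comment_text = comment.data
is_comment = comment.nodeType == comment.COMMENT_NODE

# In minidom, Comment derives from CharacterData, mirroring the DOM hierarchy.
is_character_data = isinstance(comment, CharacterData)
print(repr(comment_text), is_comment, is_character_data)
```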
This is the container of the document. Its one child element is the first, or root-level, element in the XML document. This interface contains many methods which operate on the document as a whole or which require the document itself for context.
DocumentFragment (1) (2) (3)
This is the "diet" version of a Document object, useful for passing around subtrees which may or may not contain children.
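One common use of a fragment is to build a group of nodes off to the side and then insert them in one step. The sketch below assumes Python's xml.dom.minidom; the element names are invented for illustration.

```python
from xml.dom.minidom import getDOMImplementation

# Hypothetical document with a <list> root element.
doc = getDOMImplementation().createDocument(None, "list", None)

# Build two <item> elements inside a fragment, outside the main tree.
fragment = doc.createDocumentFragment()
for label in ("one", "two"):
    item = doc.createElement("item")
    item.appendChild(doc.createTextNode(label))
    fragment.appendChild(item)

# Appending the fragment moves its children into the target; the fragment
# node itself is not inserted and ends up empty.
doc.documentElement.appendChild(fragment)
result = doc.documentElement.toxml()
print(result)
```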
DocumentType (1) (2) (3)
Any entities within the XML document, either parsed or unparsed, are represented by instances of an object defined with this interface. If available, this can include the public and system identifiers as well as the encoding and XML version. Entity nodes and their children are read-only.
EntityReference (1) (2) (3)
An EntityReference represents an entity reference in the XML tree. Note that character
references and predefined entities (such as
NamedNodeMap (1) (2) (3)
A NamedNodeMap is an unordered collection of names that correspond to Nodes.
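The most familiar NamedNodeMap is the one an element exposes for its attributes. A minimal sketch, assuming Python's xml.dom.minidom and a made-up sample element:

```python
from xml.dom.minidom import parseString

# Hypothetical sample element with two attributes.
doc = parseString('<book id="42" lang="en"/>')
attrs = doc.documentElement.attributes  # a NamedNodeMap

count = attrs.length
by_name = attrs.getNamedItem("lang").value
missing = attrs.getNamedItem("isbn")  # None when no such name exists

# Index access works, but no particular order is guaranteed,
# so the names are sorted here before use.
names = sorted(attrs.item(i).name for i in range(attrs.length))
print(count, by_name, missing, names)
```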
Node is the fundamental interface for the DOM — almost every object in the DOM structure inherits directly or indirectly from Node, even those that do not support child nodes.
The NodeList is just an ordered collection of Nodes. Some implementations reportedly implement elements as both Nodes and NodeLists, so be wary of what object you actually receive from the DOM.
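For comparison with the NamedNodeMap, here is a sketch of NodeList access, assuming Python's xml.dom.minidom. The length and item members come from the DOM interface; iterating directly over the list is a Python convenience rather than part of the DOM.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document.
doc = parseString("<list><item>a</item><item>b</item><item>c</item></list>")

items = doc.getElementsByTagName("item")  # a NodeList in document order
length = items.length
second = items.item(1).firstChild.data
past_end = items.item(3)  # an out-of-range index yields None, not an error

# Python-specific shortcut: minidom's NodeList can be iterated directly.
texts = [node.firstChild.data for node in items]
print(length, second, past_end, texts)
```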
This represents a notation as declared in a DTD. Notation nodes cannot be created through the DOM interface.
ProcessingInstruction (1) (2) (3)
This just corresponds to a PI in the XML. Note that even though the
This represents the character data of an element or attribute. There are some gotchas when dealing with text nodes: it is possible to have several adjacent ones that may need to be normalized, and mixed CDataSection and Text nodes may not behave as you would expect, even after normalization.
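The adjacent-node gotcha is easiest to see when a tree is built by hand. The following sketch assumes Python's xml.dom.minidom; other implementations may differ in detail, but the pattern of Text nodes merging while a CDATA section stays separate is the behavior described above.

```python
from xml.dom.minidom import getDOMImplementation

# Hypothetical <p> element assembled from four separate character-data nodes.
doc = getDOMImplementation().createDocument(None, "p", None)
p = doc.documentElement
p.appendChild(doc.createTextNode("Hello, "))
p.appendChild(doc.createTextNode("world"))
p.appendChild(doc.createCDATASection(" & more"))
p.appendChild(doc.createTextNode("!"))

before = len(p.childNodes)

# normalize() merges adjacent Text nodes; the CDATA section is left alone,
# so the trailing "!" cannot merge with the text in front of it.
p.normalize()
after = [(node.nodeType, node.data) for node in p.childNodes]
print(before, after)
```

Code that expects a single text child after normalization therefore still needs to cope with several children when CDATA sections are present.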
The configuration of the DOM is located here. A set of SAX-like properties is used, including
The DOMError interface describes the type of error encountered, along with additional details such as the severity
A DOMErrorHandler describes a callback interface used when an error occurs. It is set through the DOMConfiguration interface.
DOMException (1) (2) (3)
For languages that support exception handling, the DOMException is used when something truly exceptional (pardon the pun)
happens. Normal errors would return normal exceptions, such as for out-of-bound errors when dealing with arrays, or null-reference exceptions when
the user passes a
DOMImplementation (1) (2) (3)
The DOMImplementation interface is used to expose methods that are used to create DOM trees and (starting in DOM Level 3) set or query the features available within the DOM.
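A brief sketch of both roles, creating a tree and querying features, assuming Python's xml.dom.minidom as the implementation. The root element name and the "NoSuchFeature" string are made up for the example.

```python
from xml.dom.minidom import getDOMImplementation

impl = getDOMImplementation()

# Create a new, empty document with a <catalog> root and no doctype.
doc = impl.createDocument(None, "catalog", None)
root_name = doc.documentElement.tagName

# Ask the implementation what it supports.
supports_core2 = impl.hasFeature("Core", "2.0")
supports_made_up = impl.hasFeature("NoSuchFeature", "1.0")
print(root_name, supports_core2, supports_made_up)
```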
This is just a list of available DOM implementations.
The DOMImplementationSource interface provides a way for a user program to request a DOM implementation that implements a certain set of user-specified features.
This points to a location in a document, such as the point at which an error occurred. It includes fields for the line and column
number, the byte offset into the stream, the URI of the document, and more. If one or more of these are not available, they will
be set to
This is just an ordered list (e.g. an array) of DOMString values.
ExceptionCode (1) (2) (3)
This is an enumeration of exception codes:
DOM Levels 1, 2 and 3:
DOM Levels 2 and 3:
DOM Level 3:
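To show how a DOMException and its ExceptionCode surface in practice, here is a minimal sketch assuming Python's xml.dom package, where each exception class carries a numeric code attribute. The helper code_of is hypothetical and exists only for this example.

```python
import xml.dom
from xml.dom.minidom import getDOMImplementation

doc = getDOMImplementation().createDocument(None, "root", None)


def code_of(action):
    """Run action and return the code of the DOMException it raises, or None."""
    try:
        action()
    except xml.dom.DOMException as exc:
        return exc.code
    return None


# A document may hold only one root element.
second_root = code_of(lambda: doc.appendChild(doc.createElement("other")))

# Removing a node that is not a child of the target.
not_a_child = code_of(
    lambda: doc.documentElement.removeChild(doc.createElement("x"))
)
print(second_root, not_a_child)
```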
This interface describes an ordered list of names each of which corresponds to a namespace.
The UserDataHandler interface defines a way for an application to get a callback whenever a node is cloned, imported or renamed.
This is the base interface for any specialized views of the DOM.
This is an alternate view of the DOM. Perhaps this could be used to represent the DOM after a CSS transformation has occurred, or based on some other transformation or presentation of the underlying physical DOM.
To fire an event of a certain type against the document, use the method supplied by this interface.
This is the base interface for any type of event against the DOM. This is what is passed to an event handler.
Events that fail may throw this type of exception. Typically it will return
If the event model is supported by a DOM, all participating nodes will also implement the EventTarget interface to denote they can be recipients of events. Via this interface, event listener code can be attached.
This is a type of user interface event that describes a mouse action.
A mutation event is an event which changes the structure of the DOM. The various types of defined
This is the generic interface for user-interface-related events. It is anticipated that in addition to the mouse event interface, a KeyEvent interface will be supplied in a future draft.
These do not apply to the XML DOM, but only to the HTML DOM, with the exception of one footnote. This one exception is the processing instruction which triggers a browser to load an XSLT document and apply it to the current XML document. That PI looks like this:
There are other optional pseudo-attributes, including
This allows a list of nodes matching some criteria to be stepped through one at a time, in document order, both forwards and backwards. It is valid until detached even if the underlying document changes.
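Python's xml.dom.minidom does not provide the traversal interfaces, so the following is only a rough approximation of the idea: a hypothetical generator, iter_nodes, that yields matching nodes in document order. It steps forwards only and has no notion of detaching, so it is a stand-in for the concept rather than an implementation of NodeIterator.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def iter_nodes(root, accept):
    """Yield descendants of root, in document order, that accept() approves.

    Hypothetical helper: a forward-only stand-in for a filtered NodeIterator.
    """
    for child in root.childNodes:
        if accept(child):
            yield child
        yield from iter_nodes(child, accept)


# Hypothetical sample document.
doc = parseString("<a><b>x<c/></b><d/></a>")

elements = [
    node.tagName
    for node in iter_nodes(doc, lambda n: n.nodeType == Node.ELEMENT_NODE)
]
print(elements)
```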
The TreeWalker interface allows you to navigate the DOM using the filtering but in a tree-like fashion instead of the list-like fashion of the NodeIterator. It also supports returning only certain types of nodes.
This interface exposes the method for creating a Range object.
A range describes some portion of a document starting and stopping at specific locations. This is not just a subtree, as it may start anywhere, even in the middle of text content, and end anywhere. The only limitation is that both starting and ending points must have a containing object that is a common ancestor. The two boundary points can be as close as being within the same string, or as far apart as the starting and ending objects of the document.
DOM Load and Save
This adds new methods to the DOMImplementation interface for creating the objects used for loading and saving.
This denotes an input source for loading. It supports both character and byte streams; if neither is present, the parser tries to resolve the string data, then the system ID, then the public ID.
This denotes a binary input stream for loading as XML.
This is a type of event that signals the end of a load.
This is the destination for XML output. It must contain either a character stream, a byte stream or a system ID.
This denotes a binary output stream for saving as XML.
The filter can intercept nodes as they are being parsed. It can abort the parsing, or inject, modify or remove nodes on the fly as the DOM is being built.
This is a callback that can show the progress of parsing.
This denotes a sequence of characters to be read as XML for loading.
A resolver can redirect references to external objects. A URI and other related information are passed in, and a LSInput object is returned that corresponds to that resource if it is available.
The serializer writes a DOM or in fact any node type to an LSOutput destination. It will fix up namespaces and escape characters as necessary.
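Python's xml.dom.minidom has no LSSerializer, but its toxml method gives a feel for the same two points: any node can be written out, and special characters are escaped along the way. This sketch is a stand-in under that assumption; it does not show namespace fix-up, and the sample content is made up.

```python
from xml.dom.minidom import getDOMImplementation, parseString

# Hypothetical <p> element whose text contains characters that need escaping.
doc = getDOMImplementation().createDocument(None, "p", None)
doc.documentElement.appendChild(doc.createTextNode("a < b & c"))

# Any node can be serialized, not only the Document itself.
markup = doc.documentElement.toxml()

# Parsing the output again recovers the original text.
round_trip = parseString(markup).documentElement.firstChild.data
print(markup, "|", round_trip)
```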
This denotes a sequence of characters to be written as XML.
This subclasses the NodeEditVAL to expose additional methods useful for textual processing, such as whether this node is all whitespace or not.
This specifies properties such as whether the document must be continually re-validated as each change is made to the DOM, and also includes methods to force validation.
Some of the validation operations may throw this exception — typically when asked to validate before a schema is attached.
This is the basis for all node-oriented validation interfaces, and supports checking for well-formedness as well as schema compliance. It surfaces enumerated values when appropriate as well as the default value for an element or attribute.
The End of DOM
Yeah, with all these methods, we wish. However, the DOM is useful, and Stylus Studio® XML Enterprise Suite contains many powerful tools based on the DOM model which provide many of the benefits the above interfaces imply: guided editing, on-the-fly validation, support of multiple encodings, and much higher-level constructs like XML Pipeline Tools and XML Schema Tools. Investigate today why Stylus Studio® is the choice of serious XML developers by downloading your free evaluation copy.