
Re: DOM versus XDM: Differences in handling CDATA sections

  • From: Michael Kay <mike@saxonica.com>
  • To: xml-dev@lists.xml.org
  • Date: Fri, 12 Nov 2010 17:03:54 +0000

DOM and XDM are not the only models of XML: there are also JDOM, DOM4J, 
XOM, and others. They all differ in various respects, and most of them 
have options and levels. In addition, DOM is defined as an API whereas 
XDM is defined as an object model, so XDM doesn't discuss API-oriented 
issues such as thread-safety.

Schema processors often work in pure streaming mode, without building a 
tree representation of the document in memory.
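
To illustrate what "pure streaming mode" means, here is a minimal sketch 
using Python's standard xml.sax module. It is not a schema validator; the 
DepthChecker class and the scan function are hypothetical names invented 
for this example. It only shows that a processor can compute facts about 
a document from parse events, without ever holding a tree in memory.

```python
import xml.sax


class DepthChecker(xml.sax.ContentHandler):
    """Hypothetical handler: counts elements and tracks nesting depth
    from SAX events alone, so no tree is ever built."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0
        self.elements = 0

    def startElement(self, name, attrs):
        # Called once per start tag, in document order.
        self.depth += 1
        self.elements += 1
        self.max_depth = max(self.max_depth, self.depth)

    def endElement(self, name):
        # Called once per end tag; nothing about the element is retained.
        self.depth -= 1


def scan(xml_bytes):
    """Return (element count, maximum nesting depth) for a document."""
    handler = DepthChecker()
    xml.sax.parseString(xml_bytes, handler)
    return handler.elements, handler.max_depth


if __name__ == "__main__":
    print(scan(b"<a><b><c/></b><b/></a>"))
```

A real validator would keep a stack of content-model states rather than 
a simple depth counter, but the event-driven shape would be similar.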

The XSLT 2.0 specification is defined in terms of XDM, but actual 
products may use any of these models underneath, performing the 
necessary mappings (e.g. expanding entity references and masking CDATA 
sections) as required. Many XSLT processors have their own internal tree 
model which will have some kind of relationship to the XDM model used in 
the specification, but often not a direct representation - for example, 
a naive implementation of the way XDM describes namespaces would be 
horrendously inefficient.
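
As a rough illustration of such a mapping, the sketch below parses the 
CDATA example from the original post with Python's xml.dom.minidom (one 
DOM implementation among many; others may behave differently) and then 
"masks" the CDATA section by merging adjacent Text and CDATASection 
children into a single string, which is roughly how XDM would see the 
content. The function name merged_text_children is hypothetical, and 
this is not how any particular XSLT processor does it.

```python
from xml.dom import Node, minidom


def merged_text_children(element):
    """Return the element's children with CDATA sections masked:
    each run of adjacent Text/CDATASection nodes becomes one string,
    and any other child node is passed through unchanged."""
    result = []
    for child in element.childNodes:
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
            if result and isinstance(result[-1], str):
                result[-1] += child.data  # extend the current text run
            else:
                result.append(child.data)  # start a new text run
        else:
            result.append(child)
    return result


doc = minidom.parseString(
    "<root>hello<![CDATA[if A < B then ...]]> world</root>"
)
root = doc.documentElement

# DOM view: the CDATA section is a node of its own.
dom_kinds = [child.nodeType for child in root.childNodes]

# XDM-like view: a single run of text under the element.
xdm_like = merged_text_children(root)

if __name__ == "__main__":
    print(dom_kinds)
    print(xdm_like)
```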

Michael Kay
Saxonica

On 12/11/2010 16:38, Costello, Roger L. wrote:
> Hi Folks,
>
> My understanding is that an XML document is first processed by an XML parser, which creates an in-memory tree representation of the XML document. Then, an application such as an XML Schema validator or an XSLT processor operates on the in-memory tree representation. Here is a simple graphic I created to show this:
>
> http://www.xfront.com/DOM-versus-XDM/How-an-XML-document-is-processed.gif
>
>
> It is my understanding that the in-memory model created by different XML parsers may be different, depending on whether the XML parser creates a DOM or XDM in-memory model.
>
> Here are two places where differences arise:
>
>     - CDATA sections
>     - Entities
>
> Also, there are differences with respect to:
>
>     - Concurrent access
>
>
> ------------------------------
> CDATA SECTIONS: DOM VERSUS XDM
> ------------------------------
>
> This XML document contains a CDATA section:
>
>      <root>
>          hello<![CDATA[if A < B then ...]]> world
>      </root>
>
> As mentioned, there are two ways to model XML documents:
>
>    - Document Object Model (DOM) [1]
>
>    - XML Data Model (XDM) [2]
>
> The two models represent the above XML document differently:
>
>     - A DOM tree will have a node for the CDATA section, as evidenced by
>       the fact that the DOM API has a method for accessing CDATA sections [3].
>       Here is a graphic I created to show the DOM tree for the XML document:
>
>       http://www.xfront.com/DOM-versus-XDM/DOM-implementation-of-CDATA.gif
>
>     - An XDM tree does not have a node for the CDATA section. The CDATA
>       section is resolved; i.e., the contents of the CDATA section are
>       merged with the surrounding text.
>       Here is a graphic I created to show the XDM tree for the XML document:
>
>       http://www.xfront.com/DOM-versus-XDM/XDM-implementation-of-CDATA.gif
>
>
> Notice that in the DOM tree there are three nodes under the Element node, whereas in XDM there is only one Text node under the Element node.
>
> ------------------------------
> ENTITIES: DOM VERSUS XDM
> ------------------------------
>
> This XML document uses an entity:
>
>      <root>
>          hello if A &lt; B then ... world
>      </root>
>
> DOM and XDM represent entities differently:
>
>     - A DOM tree will have a node for the entity, as evidenced by
>       the fact that the DOM API has a method for accessing entities [4].
>       Here is a graphic I created to show the DOM tree of the XML document:
>
>       http://www.xfront.com/DOM-versus-XDM/DOM-implementation-of-entities.gif
>
>     - An XDM tree does not have a node for the entity. The entity
>       is resolved; i.e., the entity is replaced by its replacement
>       text and is merged with the surrounding text.
>       Here is a graphic I created to show the XDM tree of the XML document:
>
>       http://www.xfront.com/DOM-versus-XDM/XDM-implementation-of-entities.gif
>
>
> Notice that in the DOM tree there are three nodes under the Element node, whereas in XDM there is only one Text node under the Element node.
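
One caveat on the DOM side of this comparison: the DOM Level 1 Core 
specification treats character references and references to predefined 
entities such as &lt; as already expanded by the XML processor, so for 
this particular document a DOM is expected to hold a single Text node 
too. EntityReference nodes may appear only for other general entities, 
and only when the parser is set up not to expand them. A minimal check, 
assuming Python's xml.dom.minidom as the DOM implementation (others may 
differ in detail):

```python
from xml.dom import Node, minidom

doc = minidom.parseString("<root>hello if A &lt; B then ... world</root>")
children = doc.documentElement.childNodes

# Node types of the children of <root>; no EntityReference node is
# created for the predefined entity &lt;.
kinds = [child.nodeType for child in children]

# The text content, with &lt; already replaced by "<".
text = "".join(child.data for child in children)

if __name__ == "__main__":
    print(kinds)
    print(text)
```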
>
> ---------------------------------
> CONCURRENT ACCESS: DOM VERSUS XDM
> ---------------------------------
>
> There are occasions where multiple applications (processes) need to operate on the same in-memory tree. Recently, Hans-Juergen Rennau reported [5] problems with concurrent access to DOM trees. He found no problems with concurrent access to XDM trees.
>
> I then learned [6] that the DOM specification does not require that implementations provide a thread-safe DOM API; i.e., it does not require that concurrent access to a DOM tree be properly synchronized.
>
>
> QUESTIONS
>
> 1. Are the above description and graphic of how XML documents are processed correct?
>
> 2. Are the above description and graphics of the differences in how CDATA sections and entities are represented in DOM and XDM correct?
>
> 3. Is the above description of the differences in thread-safety of DOM and XDM correct?
>
> 4. Will applications behave differently depending on whether the XML parser they use generates DOM or XDM? If so, isn't that really bad?
>
> 5. Do XML Schema validators use DOM or XDM to represent the XML Schema and the XML instance document?
>
> 6. If I were to create my own XML Schema validator, do I have the option of choosing to use DOM or XDM? Or, does the XML Schema specification require me to use one of them? If so, which one?
>
> 7. Do XSLT processors use DOM or XDM to represent the XSLT document and the XML instance document?
>
> 8. If I were to create my own XSLT processor, do I have the option of choosing to use DOM or XDM? Or, does the XSLT specification require me to use one of them? If so, which one?
>
> 9. For each of the following products, does it use DOM or XDM?
>
> XML Schema Validators
>
>     - XERCES: DOM or XDM?
>
>     - SAXON: DOM or XDM?
>
>     - XSV: DOM or XDM?
>
>     - MSXML: DOM or XDM?
>
>     - LIBXML: DOM or XDM?
>
> XSLT Processors
>
>     - XALAN: DOM or XDM?
>
>     - SAXON: DOM or XDM?
>
>     - MSXML: DOM or XDM?
>
>     - XSLTPROC: DOM or XDM?
>
>
> /Roger
>
>
> [1] http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html
>
> [2] http://www.w3.org/TR/xpath-datamodel/
>
> [3] http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html#ID-667469212
>
> [4] http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html#ID-11C98490
>
> [5] http://sourceforge.net/mailarchive/forum.php?thread_name=4CDBE667.2050400%40saxonica.com&forum_name=saxon-help
>
> [6] http://sourceforge.net/mailarchive/message.php?msg_name=4CDBE667.2050400%40saxonica.com
>
>
>


