Subject:#documentFragment, #document and XML efficiency Author:Mark Smith Date:20 Apr 2005 11:37 AM
Hi,
I guess to start with, I’ll tell you what it is I’m trying to do.
I have a Java servlet, and from this servlet I want to pass 3 separate XML documents into an XSLT transform. I don’t want to lump the XML together under one root and pass it in as a single document, because I have no control over 2 of the 3 XML sources and do not know what their content could be. Therefore I would like to keep all 3 XML sources separate so that each can provide its own XSD for the XSLT developer to work with. I can’t create one fixed schema as I don’t know the content of 2 of the XML sources.
Anyway, I have been taking these XML streams and using the DOMParser to get a DOM for each of these XML files. Then I have been passing these DOMs into the transform using setParameter. I have then been able to use my xsl:params in my XSLT and all is good!
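For readers following along, the parse-and-bind step described above might look roughly like the sketch below. It is only an illustration: it uses JAXP's DocumentBuilder rather than the Xerces DOMParser mentioned in the post, the class and method names are hypothetical, and whether a stylesheet can actually navigate a bound DOM node depends on the XSLT processor in use.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original servlet.
class DomParamSketch {

    // Parse one XML source into a DOM Document.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // XSLT works on namespace-aware trees
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Bind the parsed document to the global xsl:param with the given name.
    // Assumption: the processor accepts org.w3c.dom.Node parameter values.
    static void bind(Transformer transformer, String paramName, String xml) {
        transformer.setParameter(paramName, parse(xml));
    }
}
```

In the servlet this would be called once per extra XML source, for example `DomParamSketch.bind(transformer, "XMLDocument", xml)`, before running the transform.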
But I’m aware of the DOM being inefficient and I don’t really want to be converting my XML streams to DOMs just so I can get them into the XSLT world. I don’t perform any operations on these XML streams in my servlet, I just pass them straight into the transform. So I would like a way of getting my XML streams into the transform without using the DOM.
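One way to keep the servlet free of DOM code, not discussed in this thread, is to let the stylesheet pull the extra sources with document() and have a javax.xml.transform.URIResolver answer those calls from memory. The sketch below assumes the extra sources are available as strings keyed by a made-up name such as 'beer.xml'; the class name, stylesheet, and sample element names are illustrative only, and the processor still builds its own internal tree for each source.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: extra XML sources are served through a URIResolver.
class ResolverSketch {

    // Illustrative stylesheet: reads the primary input and one extra source
    // requested by name through document().
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='BoozeXMLFile/Scotch'><xsl:value-of select='.'/>,</xsl:for-each>"
      + "<xsl:for-each select=\"document('beer.xml')/BoozeXMLString/Beer\">"
      + "<xsl:value-of select='.'/>,</xsl:for-each>"
      + "</xsl:template></xsl:stylesheet>";

    static String transform(String primaryXml, Map<String, String> extraSources) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            // Answer document('name') from the in-memory map; returning null
            // lets the processor fall back to its default URI resolution.
            t.setURIResolver((href, base) -> {
                String xml = extraSources.get(href);
                return xml == null ? null : new StreamSource(new StringReader(xml));
            });
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(primaryXml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transform failed", e);
        }
    }
}
```

The design choice here is that the three sources stay separate and the Java side only ever handles streams; the stylesheet decides when to load each one.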
So I tried passing my XML into setParameter as a string, which resulted in my xsl:param being created as a #documentFragment instead of a #document. So I read around the documentFragment and found that it seems to be a slimmed-down, more efficient fragment of a document. The question is: how can I use this #documentFragment like a #document? It just seems to be a string with no context. Do I have to load this #documentFragment into a #document before I can use it? If so, how do I do that in XSLT? Any help or suggestions would be greatly appreciated! I’m all ears! ;-)
The XSLT I have been playing with within Stylus is as follows (I just set my parameters instead of passing them in via my servlet so that I can step through in the Stylus debugger)….
<!-- XML from transform -->
SCOTCH: <br/>
<xsl:for-each select="BoozeXMLFile/Scotch">
<xsl:value-of select="."/>,
</xsl:for-each>
<br/><br/>
<!-- XML from document load -->
TEQUILA: <br/>
<xsl:for-each select="$XMLDocument/BoozeXMLFile2/Tequila">
<xsl:value-of select="."/>,
</xsl:for-each>
<br/><br/>
<!-- XML from String -->
BEER: <br/>
<!-- *** The following String is displayed OK *** -->
The XML String : <xsl:value-of select="$XMLString"/><br/>
<!-- *** But this kind of thing does not work on a documentFragment *** -->
<xsl:for-each select="$XMLString/BoozeXMLString/Beer">
<xsl:value-of select="."/>,
</xsl:for-each>
<br/><br/>
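The behaviour in the last block can be reproduced from Java with a small sketch. In XPath 1.0 a string is not a node-set, so a string parameter can be printed with xsl:value-of but offers nothing for a path expression to navigate. The class name and the cut-down stylesheet below are assumptions made for illustration; only the parameter name XMLString comes from the snippet above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: an XML string passed as a parameter stays a string.
class StringParamSketch {

    // Illustrative stylesheet that only echoes the parameter as text.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:param name='XMLString'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select='$XMLString'/>"
      + "</xsl:template></xsl:stylesheet>";

    static String echo(String xmlString) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            // The value is bound as a plain string; no parsing takes place.
            t.setParameter("XMLString", xmlString);
            StringWriter out = new StringWriter();
            // A placeholder primary input, since the stylesheet ignores it.
            t.transform(new StreamSource(new StringReader("<dummy/>")), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transform failed", e);
        }
    }
}
```

The markup comes back character for character, which shows that the processor never treated it as a tree.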
As described in the documentation, setParameter is implementation-dependent.
For XalanJ (the processor bundled with Java), for instance, the following types are supported:
- primitive (String, Boolean, Double)
- DOM node (org.w3c.dom.Node)
- XPath expression result (org.w3c.dom.traversal.NodeIterator)
Unfortunately InputSource is not supported.
I wouldn't worry much about DOM inefficiency anyway. The XSLT processor always creates a DOM internally; if you pass a pre-built DOM, the processor will use that.
Subject:#documentFragment, #document and XML efficiency Author:Mark Smith Date:20 Apr 2005 05:12 PM
Hi Ivan,
Thanks for the reply!
So anytime I get a #document in the context of my Stylus Studio Debugger does that mean the XSLT processor has created a DOM internally?
Does this mean that if I do a document('datafile.xml') it will create a DOM, and if I get a node-set it will create a DOM, etc.?
And for me to do any kind of processing of XML data within an XSLT, do I HAVE to have a #document (DOM) of that XML data? Can I not use a #documentFragment or something else?
My colleague has it that “a DOM is inefficient and we don’t need one” and tasked me with coming up with an alternative, I’ve drawn a blank so far, I keep coming back to the DOM, so can I tell him there really is no alternative to using the DOM?
Subject:#documentFragment, #document and XML efficiency Author:Ivan Pedruzzi Date:20 Apr 2005 10:15 PM
So anytime I get a #document in the context of my Stylus Studio Debugger does that mean the XSLT processor has created a DOM internally?
Does this mean that if I do a document('datafile.xml') it will create a DOM, and if I get a node-set it will create a DOM, etc.?
***** Ivan:
Almost all XSLT processors need to build a DOM tree in memory before doing anything with the XML.
Only a very small number of implementations, those designed on top of XML databases, use a different approach.
*****
And for me to do any kind of processing of XML data within an XSLT, do I HAVE to have a #document (DOM) of that XML data? Can I not use a #documentFragment or something else?
***** Ivan:
Document fragments are still DOM structures.
Trying to bind a string to a variable will not work either.
*****
My colleague has it that “a DOM is inefficient and we don’t need one” and tasked me with coming up with an alternative, I’ve drawn a blank so far, I keep coming back to the DOM, so can I tell him there really is no alternative to using the DOM?
***** Ivan:
When you are binding XML documents to global parameters you have to create a DOM.
If you prefer to rely entirely on the processor, an alternative approach would be to serialize the incoming streams to disk and then assign the filenames to the parameters.
*****
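The serialize-to-disk suggestion could be sketched as follows. This is an assumption-laden illustration rather than Ivan's own code: the parameter name BeerFile, the class name, and the stylesheet are made up, and the stylesheet is assumed to load the file itself with document($BeerFile).

```java
import java.io.InputStream;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: spool a stream to a temp file, pass its URI as a string.
class FileParamSketch {

    // Illustrative stylesheet: loads the file named by the BeerFile parameter.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:param name='BeerFile'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='document($BeerFile)/BoozeXMLString/Beer'>"
      + "<xsl:value-of select='.'/>,</xsl:for-each>"
      + "</xsl:template></xsl:stylesheet>";

    static String transform(InputStream beerStream) {
        Path tmp = null;
        try {
            // Write the incoming stream to disk without parsing it.
            tmp = Files.createTempFile("beer", ".xml");
            Files.copy(beerStream, tmp, StandardCopyOption.REPLACE_EXISTING);

            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            // Only a string (the file URI) crosses into the stylesheet.
            t.setParameter("BeerFile", tmp.toUri().toString());

            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader("<dummy/>")), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transform failed", e);
        } finally {
            if (tmp != null) {
                try { Files.deleteIfExists(tmp); } catch (Exception ignored) { /* best-effort cleanup */ }
            }
        }
    }
}
```

The trade-off is extra disk I/O per request in exchange for leaving all parsing to the processor.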
Subject:#documentFragment, #document and XML efficiency Author:Mark Smith Date:21 Apr 2005 11:54 AM
Thanks for your input Ivan!
I did read that a lot of transform processors don’t actually use a DOM internally, for performance reasons. Apparently, a lot of the newer processors build an optimized, cached internal tree structure of the document and process that. With this in mind, I’m going to look into SAX for getting my XML into the transform world, as this event-driven approach is meant to be quicker than the DOM approach.
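A minimal sketch of the SAX route, assuming JAXP's SAXSource is used for the primary input: the parser pushes events straight to the processor, and no org.w3c.dom tree is built in the calling code. The class name and stylesheet are hypothetical. Note that this covers only the main input document, not values passed through setParameter, and the processor is still free to build whatever internal structure it likes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical sketch: feed the primary input to the transform via SAX.
class SaxInputSketch {

    // Illustrative stylesheet that lists the Scotch elements as text.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='BoozeXMLFile/Scotch'><xsl:value-of select='.'/>,</xsl:for-each>"
      + "</xsl:template></xsl:stylesheet>";

    static String transform(String xml) {
        try {
            // A namespace-aware SAX parser supplies the events.
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader reader = spf.newSAXParser().getXMLReader();
            SAXSource source = new SAXSource(reader, new InputSource(new StringReader(xml)));

            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(source, new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transform failed", e);
        }
    }
}
```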