RE: MSXML Whitespace handling
At 13:51 01/08/00 -0700, Andrew Kimball wrote:
>As for mangling by default, that is a beef with the design of the MS DOM,
>not with the conformance of MS XSL. The MS DOM defaults towards performance
>and low memory consumption, while still staying within the XML 1.0 spec. I
>think it was the right decision for the vast majority of users. Users who
>need to preserve whitespace can always set preserveWhiteSpace=true when
>loading the DOM, or use xml:space="preserve" to tag significant whitespace.

As Andy says, it is a beef with the design of the MS DOM rather than MS XSL.

From a standards point of view, it all comes down to whether MS DOM is counted as an XML processor or an XML application. The XML Recommendation states:

"A software module called an XML processor is used to read XML documents and provide access to their content and structure. It is assumed that an XML processor is doing its work on behalf of another module, called the application. This specification describes the required behavior of an XML processor in terms of how it must read XML data and the information it must provide to the application."

Andy said:

"The application responsible for parsing the input XML and building the tree cache is the DOM, not XSLT. Therefore, it is perfectly reasonable to view the DOM as the "application" referred to in the XML 1.0 spec."

It seems the job of MS DOM is to read in (parse) and provide access to the content and structure of the XML document: squarely in the preserve of the 'XML processor' rather than the 'XML application'. (If that's not the case, how does MS DOM *apply* the information in the XML document as a standalone application?) It seems to me that it is MS XSL that actually performs some action as a result of the XML: MS XSL is an XML application, MS DOM is an XML processor.
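To make the processor/application split concrete, here is a minimal sketch in Python rather than MSXML. It assumes the standard library's xml.sax reader (expat, non-validating) stands in for the 'XML processor', and a hypothetical RecordingApplication handler stands in for the 'application'. The class name, function name and sample document are illustrative only and do not come from this thread.

```python
import xml.sax


class RecordingApplication(xml.sax.ContentHandler):
    """Hypothetical 'application': it sees only what the processor reports."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # The parser may deliver character data in several pieces,
        # so collect them and join later.
        self.chunks.append(content)


def character_data(document: bytes) -> str:
    """Parse `document` and return all character data handed to the application."""
    app = RecordingApplication()
    xml.sax.parseString(document, app)
    return "".join(app.chunks)


# Illustrative input: the only thing between </a> and <b> is a single space.
SAMPLE = b"<doc><a>x</a> <b>y</b></doc>"

if __name__ == "__main__":
    print(repr(character_data(SAMPLE)))
```

In this arrangement the parser only reads and reports; whether a whitespace-only run matters is left to the code that receives it.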
In the section on White Space Handling (2.10) the XML Recommendation states:

"An XML processor must always pass all characters in a document that are not markup through to the application. A validating XML processor must also inform the application which of these characters constitute white space appearing in element content."

Given that MS DOM is an XML processor, it should be passing the whitespace within xsl:text through to MS XSL so that it can deal with it properly.

From a usability point of view, in my experience one of the main uses of xsl:text is to add whitespace in some output. I'm sure that it makes MS DOM quicker and leaner not to worry about whitespace, but it seriously detracts from its utility as an XML processor to be used by an XSLT processor like MS XSL.

If there were a normative XSLT DTD, and the XSLT DTD specified:

<!ATTLIST xsl:text xml:space (preserve) #FIXED 'preserve'>

then presumably MS DOM would preserve the whitespace within xsl:text. As it is, the DTD that is supplied within the XSLT Recommendation is non-normative, and I imagine that most XSLT processors decide what to do on the basis of an implicit understanding of the intention behind the definitions given within the XSLT Recommendation rather than relying on an explicit DTD. It is clearly the intention within [http://www.w3.org/TR/xslt#strip] that xsl:text should preserve whitespace; XML applications that deal with XSLT should treat these elements as if they had xml:space="preserve" declared on them.

As a compromise, could MS DOM treat xsl:text as if xml:space="preserve" were defined on it?

Perhaps unfortunately, because it would be nice if a small compromise were all that's needed, the rules governing whether whitespace is significant within XSL elements are more complex than whether an element has xml:space="preserve" or even whether it's an xsl:text element.
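The stylesheet-side rule discussed above can be sketched as follows. This is a simplified illustration using Python's xml.dom.minidom, not MS DOM or MS XSL, and the function names and sample stylesheet are hypothetical. It assumes the rule in the #strip section of the XSLT Recommendation as applied to stylesheet documents: a whitespace-only text node is dropped unless its parent is xsl:text, or the nearest xml:space declaration on the parent or an ancestor says "preserve". It only works if the parser has handed over the whitespace in the first place.

```python
from xml.dom import minidom, Node

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
XML_NS = "http://www.w3.org/XML/1998/namespace"
XML_WHITESPACE = " \t\r\n"


def space_preserved(element):
    """True if the nearest xml:space declaration, if any, is 'preserve'."""
    node = element
    while node is not None and node.nodeType == Node.ELEMENT_NODE:
        value = node.getAttributeNS(XML_NS, "space")
        if value in ("preserve", "default"):
            return value == "preserve"
        node = node.parentNode
    return False


def is_xsl_text(element):
    return element.namespaceURI == XSL_NS and element.localName == "text"


def strip_stylesheet_whitespace(element):
    """Remove whitespace-only text nodes that the stylesheet rule allows dropping."""
    keep = is_xsl_text(element) or space_preserved(element)
    for child in list(element.childNodes):  # copy: we mutate while iterating
        if child.nodeType == Node.TEXT_NODE:
            if not keep and child.data.strip(XML_WHITESPACE) == "":
                element.removeChild(child)
        elif child.nodeType == Node.ELEMENT_NODE:
            strip_stylesheet_whitespace(child)


# Illustrative stylesheet: three elements each containing a single space.
SHEET = (
    '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">\n'
    '  <xsl:template match="/">\n'
    '    <a/><xsl:text> </xsl:text><b xml:space="preserve"> </b><c> </c>\n'
    '  </xsl:template>\n'
    '</xsl:stylesheet>'
)

if __name__ == "__main__":
    doc = minidom.parseString(SHEET)
    strip_stylesheet_whitespace(doc.documentElement)
    print(doc.documentElement.toxml())
```

The point of the sketch is the order of responsibility: the decision to strip is made by the code that understands XSLT, after the parser has reported every character.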
In XSLT, you can define elements within which whitespace should be preserved using xsl:preserve-space (in combination with xsl:strip-space). If MS XSL is not given sufficient information to process these elements according to the XSLT Recommendation, then these elements are useless when used with it.

A larger compromise would involve MS DOM treating all mixed-content and #PCDATA XSLT elements as if xml:space="preserve" were defined on them.

However, for true compliance as an XML processor, to avoid spurious exceptions for XSLT elements, and to enable MS XSL (and, eventually, other XML applications) to perform in a useful and compliant manner, MS DOM should preserve whitespace by default. If MS DOM does not, MS XSL should use a conformant XML processor instead, to enable it to conform to the XSLT Recommendation.

My 10p worth :)

Cheers,

Jeni

Dr Jeni Tennison
Epistemics Ltd * Strelley Hall * Nottingham * NG8 6PE
tel: 0115 906 1301 * fax: 0115 906 1304 * email: jeni.tennison@xxxxxxxxxxxxxxxx

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list