Subject: Re: How can the mere switch from DTD to XSD in the source document affect how a stylesheet handles white space?
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 22 Feb 2021 22:31:03 -0000
When a document is validated against a DTD or schema, the fact that an element
is described as having element-only content acts as a signal to treat the
whitespace within that element as ignorable. If everything is working well,
the parser notifies the XSLT processor that the whitespace can be ignored, and
the XSLT processor does so. The change here is not from DTD to XSD; it is from
validation to non-validation.

Michael Kay
Saxonica
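
When the source document is not validated, the same stripping can be requested in the stylesheet itself. The following is a minimal sketch rather than a definitive fix: it is the stylesheet from the quoted question with only the xsl:strip-space declaration added, which the original poster reports restores the earlier output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Remove whitespace-only text nodes from all source elements,
         so the result no longer depends on whether the parser
         validated the document and reported ignorable whitespace. -->
    <xsl:strip-space elements="*"/>
    <xsl:output method="text" omit-xml-declaration="yes"/>
    <xsl:template match="author">
        <xsl:text>Some text: </xsl:text>
        <!-- The string value of author concatenates every descendant
             text node, including whitespace-only ones if retained. -->
        <xsl:value-of select="."/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

Note that elements="*" also strips whitespace-only text nodes inside elements with mixed content. If the vocabulary has such elements, listing only the element-only names (for example elements="zettel head keywords author") may be the safer choice.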

> On 22 Feb 2021, at 19:41, Martin Honnen martin.honnen@xxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> On 22.02.2021 20:24, Wolfhart Totschnig wolfhart.totschnig@xxxxxxxxxxx
> wrote:
>
>> After switching from DTD to XSD in my project, I encountered -- apart
>> from the odd behavior of Chrome described in my post from two days ago
>> -- another puzzling problem, namely that, after this switch, one of the
>> stylesheets of my project produced unexpected output, specifically in
>> the handling of white space. I was able to find a solution to the
>> problem. Still, I do not understand how the problem could arise. That
>> is, I do not understand how the mere switch from DTD to XSD in the
>> source file can affect how a stylesheet handles white space. I would
>> like to ask whether one of you can explain it to me.
>>
>> In order to show the phenomenon, I have produced the following minimal
>> example.
>>
>> Before the switch from DTD to XSD, I had the following XML document:
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <?xml-stylesheet type="text/xsl" href="../zettel.xsl"?>
>> <!DOCTYPE zettel SYSTEM "../zettel.dtd">
>> <zettel>
>>    <head>
>>       <keywords>
>>          <author>
>>             <first>Sally</first>
>>             <last>Adee</last>
>>          </author>
>>       </keywords>
>>    </head>
>> </zettel>
>>
>> I transformed this document with the following stylesheet:
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <xsl:stylesheet version="2.0"
>> xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>>     <xsl:output method="text" omit-xml-declaration="yes"/>
>>     <xsl:template match="author">
>>         <xsl:text>Some text: </xsl:text>
>>         <xsl:value-of select="."/>
>>         <xsl:text>&#10;</xsl:text>
>>     </xsl:template>
>> </xsl:stylesheet>
>>
>> This produced the following output (with Saxon 9 HE):
>>
>> Some text: SallyAdee
>>
>> Then I switched from DTD to XSD. That is, now the source document looks
>> like this:
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <?xml-stylesheet type="text/xsl" href="../zettel.xsl"?>
>> <zettel xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>> xsi:noNamespaceSchemaLocation="../zettel.xsd">
>>     <head>
>>         <keywords>
>>             <author>
>>                 <first>Sally</first>
>>                 <last>Adee</last>
>>             </author>
>>         </keywords>
>>     </head>
>> </zettel>
>>
>> And now the output of the same stylesheet is different:
>>
>>             Some text:
>>                 Sally
>>                 Adee
>>
>> That is, there is a lot of additional white space. (I am here omitting
>> from the output several empty lines.) I figured out that I need to add
>> "<xsl:strip-space elements="*"/>" to the stylesheet to receive the old
>> output with the new source document. But I do not understand why this is
>> necessary, or why it was not necessary before. How can the mere switch
>> from DTD to XSD in the source document change how the stylesheet handles
>> white space? The DTD and the XSD are, as far as I can tell, equivalent
>> (i.e., the XSD is a translation of the DTD), and neither, as far as I
>> can see, says anything about the handling of white space. So I am at a
>> loss.
>
> How exactly do you run Saxon? I think it might depend on the behaviour
> of the underlying parser and I am not sure Saxon HE, as it itself does
> not support schema-aware XSLT, uses any xsi:schemaLocation hint or
> passes it to the XML parser, so basically the schema is ignored, while
> the default setting for XML based parsing might use a parser reading and
> taking the DTD into account. I would expect Saxon EE with
> schema-validation turned on for parsing to give a different output for
> the second sample.
