 Subject: Re: Indent XML Tags removes wanted Whitespace Author: (Deleted User) Date: 06 May 2002 12:23 PM
At 11.53 06/05/2002 -0400, you wrote:
>From: "Ken Sayers"
>
>Alberto,
>
>Thanks for the response.
>Does the spec state that any processing of text inside a tag like:
>
><?xml version="1.0"?><anytag>xxx </anytag>
>
>should remove leading and trailing spaces? In other words, should we
>expect xsl processors and others to do what studio does? If so, then we
>will have to handle this. We were expecting that the content would be
>left alone when you do the indent tags.
According to the spec (http://www.w3.org/TR/REC-xml#sec-white-space) it's
up to the application to consider whether whitespace is by default
preserved or ignored. In a data-centric application, it's usually ignored;
in a text-processing tool, it's usually preserved.
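As a rough illustration of that split, here is a minimal sketch using Python's standard xml.etree.ElementTree (my choice for the example, not one of the tools discussed in this thread). The parser hands the text to the application untouched; trimming it is a decision the application makes afterwards.

```python
import xml.etree.ElementTree as ET

# The fragment from the question, with its trailing space.
doc = '<?xml version="1.0"?><anytag>xxx </anytag>'
root = ET.fromstring(doc)

# A text-processing tool would typically keep the content as delivered.
preserved = root.text

# A data-centric application may choose to trim it.
ignored = root.text.strip()

print(repr(preserved))  # 'xxx '
print(repr(ignored))    # 'xxx'
```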
In the specific case of XSLT processors, they should obey
http://www.w3.org/TR/xslt#strip (text fragments made only of whitespace are
removed; leading and trailing whitespace is preserved). If the generated
output is HTML, however, whitespace is collapsed anyway, according to
http://www.w3.org/TR/html4/struct/text.html#h-9.1
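The two behaviours can be imitated in a few lines of Python. This is only a sketch under assumptions: the helper names are made up for the example, the stripping is applied to every element (a real XSLT processor is more selective about where it strips), and the collapsing is a simplification of what an HTML user agent does when rendering.

```python
import re
import xml.etree.ElementTree as ET

XML_WS = " \t\r\n"  # the four XML whitespace characters

def drop_whitespace_only_text(elem):
    """Remove text made only of whitespace; leave all other text untouched."""
    if elem.text is not None and elem.text.strip(XML_WS) == "":
        elem.text = None
    for child in elem:
        drop_whitespace_only_text(child)
        if child.tail is not None and child.tail.strip(XML_WS) == "":
            child.tail = None

def collapse_like_html(text):
    """Collapse each run of whitespace into a single space (simplified)."""
    return re.sub(r"[ \t\r\n\f]+", " ", text)

root = ET.fromstring("<a>\n  <b> x  y </b>\n</a>")
drop_whitespace_only_text(root)

# Whitespace-only fragments are gone; the spaces around "x  y" survive.
stripped = ET.tostring(root, encoding="unicode")
print(stripped)  # <a><b> x  y </b></a>

# HTML-style rendering then collapses the inner run of spaces.
collapsed = collapse_like_html(root.find("b").text)
print(repr(collapsed))  # ' x y '
```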
In the end, whitespace handling is subject to a lot of different
interpretations. Since the purpose of the indenter tool is to format an
XML file so that the hierarchy of the tags is clearly shown by manipulating
whitespace characters, it tends to behave more like a data-centric tool,
while keeping an eye on the XSLT keyword that is known to manipulate
text fragments, so that the output of an XSLT transformation done on the
indented text is equivalent to the same transformation done on the original
document.
If you find a situation where this is not the case (e.g. the HTML generated
by the XSLT transformation is broken), let us know, and we will change the
indenter accordingly.
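One way to look for such a situation is to compare a document before and after indenting, ignoring the whitespace-only fragments that indentation adds. The sketch below uses Python's ET.indent (Python 3.9 or later) purely as a stand-in for an indenter; it is not the tool discussed here and may treat element content differently, and the helper name is invented for the example.

```python
import copy
import xml.etree.ElementTree as ET

XML_WS = " \t\r\n"

def without_whitespace_only_text(root):
    """Serialize a copy of the tree with whitespace-only text removed."""
    clone = copy.deepcopy(root)
    for elem in clone.iter():
        if elem.text is not None and elem.text.strip(XML_WS) == "":
            elem.text = None
        if elem.tail is not None and elem.tail.strip(XML_WS) == "":
            elem.tail = None
    return ET.tostring(clone, encoding="unicode")

original = ET.fromstring("<doc><anytag>xxx </anytag><other/></doc>")

indented = copy.deepcopy(original)
ET.indent(indented, space="  ")  # stand-in indenter, Python 3.9+

# If the indenter only added whitespace-only fragments, both forms match.
same = without_whitespace_only_text(original) == without_whitespace_only_text(indented)
print(same)
```

If the comparison fails, the indenter has touched text that is not whitespace-only, which is the kind of case worth reporting.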
Alberto
P.S. Unfortunately the web board software we use has been designed more for
HTML than for XML, so it will not display the XML fragments you write (but
you can see them in the source for the page, and if you subscribe to have
each message sent to you by e-mail you will see it with no problem). You will
have to escape the "<" characters by using the "&lt;" entity, or by
using another character (e.g. "{").
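If you paste fragments often, the escaping can be done mechanically before posting. A minimal sketch in Python, assuming nothing about the web board beyond what is said above:

```python
from html import escape

fragment = '<?xml version="1.0"?><anytag>xxx </anytag>'

# Replace markup characters with entities ("<" becomes "&lt;").
escaped = escape(fragment, quote=False)
print(escaped)

# Or swap "<" for another character, as suggested above.
braces = fragment.replace("<", "{")
print(braces)
```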