
RE: Escaped characters being duplicated

Subject: RE: Escaped characters being duplicated
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 11 Dec 2007 23:22:58 -0000
Perplexing indeed.

I'd be less surprised if the output came out as "&amp;lt;" rather than
"&lt;&lt;". That's much more common, and could be caused by processing text
twice when it should only be processed once. 

The conversion from "<" to "&lt;" is done by the XML serializer. The fact
that you're using the Saxon XSLT processor doesn't necessarily mean that
you're using the Saxon serializer (the Saxon output could be sent to a DOM
which is then serialized using the DOM serializer); it would be a good idea
to find out what serializer is actually being used. The easiest way to find
out is to see whether the serialization is affected by xsl:output
declarations in the stylesheet.
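
A minimal sketch of that check, assuming plain JAXP with whatever
TransformerFactory is on the classpath. The class name SerializerProbe, the
probe stylesheet and the sample input are hypothetical and are not part of
the DITA Open Toolkit. The same stylesheet is run twice: once straight to a
stream, and once into a DOM that a separate identity transformer then
serializes. Only the first run should honour the xsl:output declaration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class SerializerProbe {
    // Hypothetical probe stylesheet: its xsl:output asks for no XML declaration.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/'><out><xsl:value-of select='.'/></out></xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical input containing one escaped angle bracket.
    static final String INPUT = "<doc>a &lt; b</doc>";

    static Transformer probe(TransformerFactory tf) throws Exception {
        return tf.newTransformer(new StreamSource(new StringReader(XSLT)));
    }

    // The XSLT processor serializes its own result, so xsl:output applies.
    static String direct() throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        StringWriter out = new StringWriter();
        probe(tf).transform(new StreamSource(new StringReader(INPUT)), new StreamResult(out));
        return out.toString();
    }

    // The result goes to a DOM first; a separate identity transformer
    // serializes it and never sees the stylesheet's xsl:output.
    static String viaDom() throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        DOMResult dom = new DOMResult();
        probe(tf).transform(new StreamSource(new StringReader(INPUT)), dom);
        StringWriter out = new StringWriter();
        tf.newTransformer().transform(new DOMSource(dom.getNode()), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("direct:  " + direct());
        System.out.println("via DOM: " + viaDom());
    }
}
```

If the output of the real pipeline still carries an XML declaration after
the stylesheet asks for it to be omitted, that suggests something other than
the XSLT processor's own serializer is writing the file.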

How did you satisfy yourself that both the successful and the unsuccessful
runs are using Saxon 6.5.5? JAXP is a wonderful beast, and ensures that many
people are running a different XSLT processor from the one they thought they
were using.
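
One way to check, sketched here under the assumption that the code can be
run (or pasted into a JSP) inside the same Tomcat web application as well as
from the command line. The class name ProcessorCheck is hypothetical. It
reports which TransformerFactory JAXP actually hands out and where that
class was loaded from.

```java
import java.security.CodeSource;
import javax.xml.transform.TransformerFactory;

public class ProcessorCheck {
    // Standard JAXP system property that can override the factory choice.
    static final String KEY = "javax.xml.transform.TransformerFactory";

    // Name of the factory implementation JAXP selects in this environment.
    static String factoryClassName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    // Jar or directory the factory class came from, when the JVM reports one.
    static String factoryLocation() {
        CodeSource src = TransformerFactory.newInstance().getClass()
                .getProtectionDomain().getCodeSource();
        return src == null ? "(no code source reported)" : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) {
        System.out.println("system property: " + System.getProperty(KEY));
        System.out.println("factory class:   " + factoryClassName());
        System.out.println("loaded from:     " + factoryLocation());
    }
}
```

With Saxon 6.5.5 the factory class should be
com.icl.saxon.TransformerFactoryImpl; any other name means a different
processor is being picked up in that environment. Inside a stylesheet, the
XSLT 1.0 function system-property('xsl:vendor') gives a similar answer.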

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Anderson, Paul [mailto:Paul.Anderson@xxxxxxxxxxxxx] 
> Sent: 11 December 2007 23:07
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject:  Escaped characters being duplicated
> 
> Greetings All,
> 
> We have a bunch of DITA XML content and we're using the 
> open-source DITA Open Toolkit to transform it into a variety 
> of outputs. The DITA Open Toolkit is a collection of Java 
> classes, XSL stylesheets, and ANT scripts that transform the 
> content and create the output. 
> 
> To shield our users from the command-line invocation of the 
> publishing scripts, we deployed a simple web application 
> running on Tomcat 5.5 that takes input from a JSP page and 
> invokes the necessary ANT script to generate the desired 
> output for the user. This methodology has been working quite 
> nicely for nearly a year.
> 
> Over that time, a few of our users have had a problem where 
> characters escaped in the XML content (for example, angle 
> brackets and ampersands) are duplicated in the output. For 
> example, in the place of one angle-bracket (&lt;), we end up 
> with two or sometimes four escaped angle brackets (&lt;&lt;&lt;&lt;).
> 
> I've been troubleshooting the problem and the duplication 
> always appears in the output files generated by one of the 
> XSL stylesheets in the DITA Open Toolkit. If the input file 
> contained an escaped character, the output file contains two 
> of those escaped characters. The most interesting discovery 
> so far is this: For each user that has the problem, the 
> problem goes away if they invoke the ANT script via the 
> command line; the duplication only occurs when the ANT script 
> is invoked from the JSP page running on Tomcat 5.5. Having 
> said that, the problem only exists for a few users; most 
> users never see this problem when they use the JSP page to 
> invoke the ANT script and publish the exact same XML content.
> 
> Perplexing.
> 
> Given all this background, my plea to this list is simple: 
> What sort of conditions cause an XSL transformation to 
> duplicate an escaped character? 
> 
> Would the system locale have an impact?
> Would the Java version (1.5 versus 1.6) have an impact?
> All source files use UTF-8 encoding.
> All users are using the same XSL processor: Saxon 6.5.5.
> I don't think the problem is in the XSL stylesheet or any 
> other part of the DITA Open Toolkit because all users are 
> using the same code and it works for most users.
> 
> Any ideas about this issue are appreciated.
> 
> Best regards,
> 
> Paul Anderson
> Information Developer - Codex Administrator
> Compuware Corporation
> 
> The contents of this e-mail are intended for the named addressee 
> only. It contains information that may be confidential. Unless 
> you are the named addressee or an authorized designee, you may 
> not copy or use it, or disclose it to anyone else. If you 
> received it in error please notify us immediately and then 
> destroy it.
