RE: Upper ASCII chars
> (Jeni) It depends on what processor you're using.
> (Michael) Nevertheless, many people do care, so some
> processors give you a way of controlling it. Saxon has
> an attribute saxon:character-representation, and I
> think Xalan has some kind of configuration file.
I am using Xalan. I'll look for its saxon:character-representation equivalent. That would seem to solve the problem.
> (Jeni) Out of interest, are you experiencing problems with browsers
> recognising the character entity references, or is it purely that you
> don't like the space that they take up, or find them less readable
> than the native characters?
> (David) although if you are writing HTML why do you care? the two
> forms that you show are equivalent to any HTML system.
Unfortunately, it's not an "HTML system". I'm building server-side include pages from an XML configuration file. The parameters for the <SERVLET> block (e.g. <param name="input1" value="£©®ÄËÓáöÿ.DTD">) need to be able to contain both "lower ASCII" and "upper ASCII" characters. value="&pound;" is completely different data for the SSI parser than value="£".
> (Michael) Oh dear: "upper ASCII". There's no such thing. ASCII stops
> at 0x7F. A good first rule in understanding character coding issues
> is to get your terminology straight!
Yes, ASCII is a 7-bit protocol. But in all the years I've been in this business, when someone says "upper ASCII", everyone else knows what they're talking about. Since my goal was to define my problem, and all three of you seemed to understand the issue, I believe it accomplished its purpose.
> (Jeni) As an alternative, you could change the output method to xml
> and generate well-formed HTML (or full XHTML if you want).
I did try this already, and this led to a different set of problems (mostly formatting related) which I didn't try to address at the time. If I can't get your first suggestion to work, then I'll go back to this option and try to make it work.
> (Jeni) [There's been a recent suggestion on xsl-editors@xxxxxx that
> a similar functionality to saxon:character-representation be offered
> in XSLT 2.0 - you might want to post this example there to demonstrate
> another use case.]
I'll do that this morning. Thanks for the suggestion.
> I get the following in the file:
>
> <param name="input1"
> value="&pound;&copy;&reg;&Auml;&Euml;&Oacute;&aacute;&ouml;&yuml;.DTD">
>
> What I want, though, is:
>
> <param name="input1" value="£©®ÄËÓáöÿ.DTD">
>
> Is there a way to achieve this?
It depends on what processor you're using. The XSLT 1.0 Rec states that if the output method is html and the processor knows the character entity reference for a character, then that character may be output using the character entity reference, which is what you're experiencing.
Some processors, notably Saxon (someone tell me if other processors offer this), give you a bit of control over how you want the characters to be serialized. With Saxon, you can do:
<xsl:output method="html" saxon:character-representation="native;entity" />
to tell Saxon to serialize non-ASCII characters that can be serialized as native characters in your character encoding as native characters, and those that cannot be represented in your character encoding as entities (if Saxon knows such an entity). This should give you the result that you're after (assuming that the characters that you're using are representable within your encoding).
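For context, a minimal sketch of a complete stylesheet using this attribute might look like the following. It assumes Saxon 6.x, where the saxon prefix is bound to http://icl.com/saxon (other Saxon releases may use a different namespace URI, so check your version's documentation), an output encoding of iso-8859-1, and a hypothetical input document with a /config/input1 element holding the value. None of those details come from the original question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the namespace URI, encoding and input paths are assumptions. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon"
    exclude-result-prefixes="saxon">

  <!-- Ask for native characters where the encoding allows it,
       and entity references otherwise. -->
  <xsl:output method="html" encoding="iso-8859-1"
      saxon:character-representation="native;entity"/>

  <!-- /config/input1 is a hypothetical element holding the parameter value. -->
  <xsl:template match="/">
    <param name="input1" value="{/config/input1}"/>
  </xsl:template>

</xsl:stylesheet>
```

Processors other than Saxon should ignore the saxon: attribute, since XSLT 1.0 allows extension attributes in a non-null namespace on xsl:output, so the stylesheet stays portable even though the effect is Saxon-specific.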
[There's been a recent suggestion on xsl-editors@xxxxxx that a similar functionality to saxon:character-representation be offered in XSLT 2.0 - you might want to post this example there to demonstrate another use case.]
As an alternative, you could change the output method to xml and generate well-formed HTML (or full XHTML if you want). The characters won't be represented as entities in that case because XSLT 1.0 processors can't tell the difference between normal XML and well-formed HTML, so won't escape any of the characters.
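A rough sketch of that alternative, under the same assumptions as before (a hypothetical /config/input1 element in the input, and iso-8859-1 chosen as an encoding that can represent the characters in question), could look like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the encoding and input paths are assumptions. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The xml output method has no notion of HTML entity names, so
       characters that the encoding can represent are written natively. -->
  <xsl:output method="xml" encoding="iso-8859-1"
      omit-xml-declaration="yes"/>

  <!-- /config/input1 is a hypothetical element holding the parameter value. -->
  <xsl:template match="/">
    <param name="input1" value="{/config/input1}"/>
  </xsl:template>

</xsl:stylesheet>
```

Bear in mind that with the xml method an empty element is typically written as <param ... /> rather than <param ...>, and characters outside the chosen encoding come out as numeric character references, so whatever consumes the result has to cope with both.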
Out of interest, are you experiencing problems with browsers recognising the character entity references, or is it purely that you don't like the space that they take up, or find them less readable than the native characters?
--- Jeni Tennison http://www.jenitennison.com/
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list