Subject: Re: Combining use-character-maps and normalization-form="NFC" attributes produce unwanted output
From: "lancelot.meurillon@xxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 16 Feb 2016 16:43:36 -0000
Thanks, Wolfgang.
I have raised an issue: https://saxonica.plan.io/issues/2622

Lancelot

From: Wolfgang Laun wolfgang.laun@xxxxxxxxx
[mailto:xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx]
Sent: Friday, 12 February 2016 16:42
To: xsl-list
Subject: Re:  Combining use-character-maps and normalization-form="NFC"
attributes produce unwanted output

Even a character map containing nothing but the identity mapping of the
semicolon (U+003B),
     <xsl:output-character character=";" string=";"/>
results in every semicolon being translated to U+037E. This seems to be a bug.

 SaxonHE 9.6.0.1
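
As a cross-check outside Saxon, here is a minimal sketch in Python (not part of the original setup; `code_points` is a hypothetical helper introduced only for illustration). It relies on the standard `unicodedata` module to show that NFC leaves U+003B unchanged and that U+037E canonically decomposes to U+003B, so normalization alone should never turn a semicolon into a Greek question mark:

```python
import unicodedata


def code_points(text):
    """Render a string in the \\uXXXX + \\uXXXX notation used in this thread.

    Hypothetical helper for illustration; it assumes characters in the
    Basic Multilingual Plane, which covers everything discussed here.
    """
    return " + ".join(f"\\u{ord(ch):04X}" for ch in text)


SEMICOLON = "\u003B"            # ordinary ASCII semicolon
GREEK_QUESTION_MARK = "\u037E"  # looks identical, different code point

# NFC of a plain semicolon is the same semicolon.
nfc_of_semicolon = unicodedata.normalize("NFC", SEMICOLON)

# NFC maps the Greek question mark *to* the semicolon, not the reverse.
nfc_of_greek = unicodedata.normalize("NFC", GREEK_QUESTION_MARK)

# The mapped text from the report (straight quote, semicolon, space, semicolon)
# is already in NFC, so normalizing it should change nothing.
mapped_text = "\u0022\u003B\u0020\u003B"
nfc_of_mapped_text = unicodedata.normalize("NFC", mapped_text)

print(code_points(nfc_of_semicolon))
print(code_points(nfc_of_greek))
print(code_points(nfc_of_mapped_text))
```

If the serializer's output disagrees with what this prints for the same characters, the difference comes from the processor rather than from NFC itself.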

On 12 February 2016 at 15:29, lancelot.meurillon@xxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
XSL processor: Saxon-EE 9.5.1.8J from Saxonica

XSL version: 2.0



Dear all,



For various reasons, I need to escape specific characters in the output and
also need to produce Unicode normalized to NFC.

Here is my input:

<inputText>”; ;</inputText>  => which is \u201D + \u003B + \u0020 + \u003B



Here are the output properties of my stylesheet:

<xsl:output method="xml" version="1.0" encoding="UTF-8"
        indent="yes" omit-xml-declaration="no"
        use-character-maps="unsupported_characters"
        normalization-form="NFC"
    />



The character-map definition:

<xsl:character-map name="unsupported_characters">
        <xsl:output-character character="&#8220;" string="&quot;"/>
        <xsl:output-character character="&#8221;" string="&quot;"/>
    </xsl:character-map>



With this template:

<xsl:template match="/">
    <shortDescription><xsl:value-of select="inputText"/></shortDescription>
</xsl:template>



Now the output:

<shortDescription>"; ;</shortDescription> => which is \u0022 + \u037E +
\u0020 + \u003B



Why is the semicolon (\u003B) immediately after the escaped quote translated
into a Greek question mark (\u037E), while the next semicolon is kept?

More to the point, why is my semicolon converted into a Greek question mark at
all?



Some further observations:

1- If I do not use the character map, the result is:

<shortDescription>”; ;</shortDescription> => which is \u201D + \u003B +
\u0020 + \u003B



2- If I do not normalize the Unicode (that is, without the
normalization-form="NFC" attribute), the result is:

<shortDescription>"; ;</shortDescription> => which is \u0022 + \u003B + \u0020
+ \u003B



Thanks for the help

Lancelot