Subject: Aw: Re: How to copy attribute value to text? (Suspected bug involving supplementary characters)
From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 7 Jul 2016 20:22:13 -0000
I think you can file problems at
https://saxonica.plan.io/projects/saxon/issues, but make sure you mention
the Java version and the way you use Saxon (command line, API).
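
The processor details for such a report can be read from the standard XSLT 2.0 system properties. The stylesheet below is a minimal sketch, not part of the original message; it assumes an XSLT 2.0 processor and can be run against any XML input. The Java version itself is printed by `java -version` on the command line.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Matches the document node of any input and prints processor details -->
  <xsl:template match="/">
    <xsl:text>XSLT version: </xsl:text>
    <xsl:value-of select="system-property('xsl:version')"/>
    <xsl:text>&#10;Vendor: </xsl:text>
    <xsl:value-of select="system-property('xsl:vendor')"/>
    <xsl:text>&#10;Product: </xsl:text>
    <xsl:value-of select="system-property('xsl:product-name')"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="system-property('xsl:product-version')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Pasting this output into the issue, together with the exact command line or API call used, should cover the details asked for above.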
--
This message was sent from my Android mobile phone with GMX Mail.

On 07.07.2016, 20:54, "Kenneth Reid Beesley krbeesley@xxxxxxxxx"
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

  From: Kenneth Reid Beesley <krbeesley@xxxxxxxxx>
  Subject: Re: [XSL-List: The Open Forum on XSL] Digest for 2016-07-06
  Date: July 7, 2016 at 12:43:54 PM EDT
  To: "XSL-List: The Open Forum on XSL" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>

  Many thanks to Martin Honnen for his response below.  I add more
  comments below (suspected bug in Saxon).

    On 7Jul2016, at 05:28, XSL-List: The Open Forum on XSL <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
    wrote:
    From: Martin Honnen <martin.honnen@xxxxxx>
    Subject: Re:  How to copy attribute value to text?
    Date: 7 July 2016 at 00:43:37 MDT
    To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx

    On 07.07.2016 07:22, Kenneth Reid Beesley krbeesley@xxxxxxxxx
    wrote:

      If I start with an input XML document that contains mixed
      text with <word> elements like this:

      … this is just <word correction="too">to</word> funny

      I'd like to write an XSLT stylesheet that yields as
      output

      … this is just <word origerror="to">too</word> funny

      So in the output I effectively want (in the same <word>
      element) to

      1.  Set the value of a new attribute to the original text()
      value, and
      2.  Reset the text() value to be the value of the original
      @correction attribute

      I've tried many variants of the following, so far
      without success.  I'm using SaxonHE9-7-0-6J;
      it runs, but the results are not as expected/hoped.

      I've tried matching the text() in a separate template,
      but I can't seem to reference the attribute values of
      the parent node (i.e., <word>) of the text() and the parent
      node's attributes.  E.g., the following doesn't
      work for me, failing somehow in the
      select="../@correction" reference.

      <xsl:template match="word[@correction]/text()">
        <xsl:value-of select="../@correction"/>
      </xsl:template>

    You can use

    <xsl:template match="@* | node()">
    <xsl:copy>
    <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
    </xsl:template>

    <xsl:template match="word[@correction]/text()">
    <xsl:value-of select="../@correction"/>
    </xsl:template>

    <xsl:template match="word/@correction">
    <xsl:attribute name="origerror" select=".."/>
    </xsl:template>
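
For reference, the three templates above can be assembled into a complete stylesheet. This is a sketch under one stated assumption: the `version="2.0"` wrapper is added here because the `select` attribute on `xsl:attribute` requires XSLT 2.0 or later; the templates themselves are unchanged from the message.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copies every node and attribute unchanged -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace the text of a corrected word with the correction -->
  <xsl:template match="word[@correction]/text()">
    <xsl:value-of select="../@correction"/>
  </xsl:template>

  <!-- Replace @correction with @origerror holding the original text;
       ".." is the parent <word>, whose string value is its text -->
  <xsl:template match="word/@correction">
    <xsl:attribute name="origerror" select=".."/>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon-HE 9.7 on the command line this could be run as `java -jar saxon9he.jar -s:input.xml -xsl:fix.xsl -o:output.xml`, where `input.xml`, `fix.xsl` and `output.xml` are hypothetical file names.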

  Your solution looks perfect and appears to work perfectly for
  ASCII-based XML input examples like the following

  <?xml version="1.0" encoding="UTF-8"?>
  <foo> <bar>this is just <word correction="too">to</word> funny</bar>
  </foo>

  yielding the correct/desired output

  <?xml version="1.0" encoding="UTF-8"?>
  <foo> <bar>this is just <word origerror="to">too</word> funny</bar>
  </foo>

  I now see that some of my own attempts also worked, on the same
  ASCII-based example.
  *****  Suspected bug involving supplementary characters *****
  But my real task involves an input XML document, in UTF-8 encoding,
  that consists of Deseret Alphabet characters, which are encoded in
  the supplementary area.  In such a case, the resulting text content
  in the <word> element, copied from an original attribute value, is
  corrupted.  I saw such corruption in my own attempts, and
  couldn't understand what was happening.
  Using the following input document (the Deseret Alphabet characters
  may not display correctly for you)
  <?xml version="1.0" encoding="UTF-8"?>
  <foo> <bar>pp.p p.p p>p2pp; <word
  correction="p;p">pp/p	p.</word> pp2pp.</bar>
  </foo>
  the output, using your script, is corrupted.  The text() value in the
  output is not the same as the original @correction value.  Extra
  characters (just one in this case) are inserted.  The longer the
  original attribute value, the more extra characters are inserted.
  <?xml version="1.0" encoding="UTF-8"?> <foo> <bar>pp.p
  p.p p>p2pp; <word
  origerror="pp/p	p.">p;p;p</word>
  pp2pp.</bar> </foo>
  This kind of corruption is exactly what I was seeing using my own
  scripts, leading me to bother the group.
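
One way to make such a discrepancy visible independently of fonts and mail clients is to print the code points of both values. The following is a minimal diagnostic sketch, assuming an XSLT 2.0 processor and the `<word>` / `@correction` names from the example above; it only reports what the processor sees and does not identify the cause of the corruption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- For each corrected word, list the decimal code points of the
       attribute value and of the element's text content -->
  <xsl:template match="/">
    <xsl:for-each select="//word[@correction]">
      <xsl:text>correction: </xsl:text>
      <xsl:value-of select="string-to-codepoints(@correction)" separator=" "/>
      <xsl:text>&#10;text:       </xsl:text>
      <xsl:value-of select="string-to-codepoints(.)" separator=" "/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Deseret characters occupy U+10400 to U+1044F, so each should appear as a single value between 66560 and 66639. Running the sketch on the input, and a variant selecting `@origerror` on the transformed output, would show exactly which code points were added.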
  I suspect a bug in the XSLT engine involving supplementary
  characters.  Again, I'm using SaxonHE9-7-0-6J.
  What's my next step?
  Thanks,
  Ken
  ******************************** Kenneth R. Beesley, D.Phil. PO Box
  540475 North Salt Lake UT 84054 USA
