Subject: Re: How to copy attribute value to text? (Suspected bug involving supplementary characters)
From: "Kenneth Reid Beesley krbeesley@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 9 Jul 2016 02:19:50 -0000
> From: Kenneth Reid Beesley <krbeesley@xxxxxxxxx>
> Subject: RE:  How to copy attribute value to text? (Suspected bug
involving supplementary characters)
> Date: 7 July 2016 at 12:23:29 MDT
> To: xslt <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
>
>
>
>
> *****  Suspected bug involving supplementary characters *****
>
> But my real task involves an input XML document, in UTF-8 encoding, that
consists of Deseret Alphabet characters, which are encoded in the
supplementary area.  In such a case, the resulting text content in the <word>
element, copied from an original attribute value, is corrupted.  I saw such
corruption in my own attempts, and couldn't understand what was happening.
>
> Using the following input document (the Deseret Alphabet characters may not
display correctly for you)
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> <foo>
>   <bar>pp.p p.p p>p2pp; <word
correction="p;p-">pp/p	p.</word> pp2pp.</bar>
> </foo>
>
> the output, using your script, is corrupted.  The text() value in the output
is not the same as the original @correction value.  Extra characters (just one
in this case) are inserted.  The longer the original attribute value, the more
extra characters are inserted.
>
> <?xml version="1.0" encoding="UTF-8"?>
> <foo>
>   <bar>pp.p p.p p>p2pp; <word
origerror="pp/p	p.">p;p;p-</word> pp2pp.</bar>
> </foo>
>
> This kind of corruption is exactly what I was seeing using my own scripts,
leading me to bother the group.
>
> I suspect a bug in the XSLT engine involving supplementary characters.
Again, I'm using SaxonHE9-7-0-6J.
>
> What's my next step?
>
> Thanks,
>
> Ken
>
>
>
> From: Michael Müller-Hillebrand <mmh@xxxxxxxxx>
> Subject: Re:  How to copy attribute value to text? (Suspected bug
involving supplementary characters)
> Date: 7 July 2016 at 14:20:30 MDT
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
>
>
> When copying the data and stylesheet into OxygenXML and also enabling bidi
support, the XSLT processing works fine.
>
> <?xml version="1.0" encoding="UTF-8"?>
> <foo>
>   <bar>pp.p p.p p>p2pp; <word
origerror="pp/p	p.">p;p-</word> pp2pp.</bar>
> </foo>
>
> So your problems may come from some details in your setup? How are you
running the transform?
>
> BTW, interesting letters!
>
> - Michael


I _was_ running the transform with the default JDK XML parser (Java 1.8),
using SaxonHE9-7-0-6J.
This default JDK parser is reputed to be buggy.


>
>
> From: Michael Kay <mike@xxxxxxxxxxxx>
> Subject: Re:  How to copy attribute value to text? (Suspected bug
involving supplementary characters)
>
>
> More likely to be a bug in the JDK parser. Try it using Apache Xerces, which
is much more reliable than the JDK parser. I think some of the long-standing
bugs in the JDK parser have finally been fixed in Java 8, so you could also
try it with a different JDK.
>
> Michael Kay
> Saxonica


Michael Kay is right.  I changed to using the Xerces-J parser and now
everything works as expected.
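
For readers wondering why supplementary characters are a frequent trouble
spot in Java-based XML tooling: Deseret letters occupy U+10400 to U+1044F,
above the Basic Multilingual Plane, so each one is stored in a Java String as
a surrogate pair of two UTF-16 code units.  The sketch below only illustrates
that representation.  It does not reproduce the parser problem itself, and
the class name SupplementaryDemo is made up for this example.

```java
public class SupplementaryDemo {
    // Count Unicode code points in s, as opposed to UTF-16 code units.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+10400 DESERET CAPITAL LETTER LONG I, built from its code point
        // so the source file itself stays plain ASCII.
        String longI = new String(Character.toChars(0x10400));

        System.out.println(longI.length());      // prints 2 (UTF-16 code units)
        System.out.println(codePoints(longI));   // prints 1 (code points)
        System.out.println(Character.isSupplementaryCodePoint(0x10400)); // prints true
    }
}
```

Code that buffers or copies text one char at a time has to keep both halves
of each pair together, which is one plausible way for extra or damaged
characters to appear in the output.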

By the way, I found it a little difficult to figure out how to tell Saxon to
use the Xerces parser.
I had to hunt around a bit.  I finally found the following incantation (as
coded in my Makefile).


# using Saxon XSLT with the Xerces-J parser
BoMDA1869c.xml: BoMDA1869.xml BoMDA1869c.xsl
	java -Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl \
	  -Djavax.xml.parsers.SAXParserFactory=org.apache.xerces.jaxp.SAXParserFactoryImpl \
	  net.sf.saxon.Transform -o:$@  $<  BoMDA1869c.xsl


I have saxon9he.jar and xercesImpl.jar on my CLASSPATH.  It all seems to work.
Am I missing anything?
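
One way to double-check that the two -D properties are being honoured is to
ask JAXP which factory classes it resolves.  The following is a minimal
sketch, assuming it is run with the same -D flags and CLASSPATH as the
Makefile rule.  The class name WhichParser is made up for this example, and
the sketch only reports what JAXP selects in that JVM; it does not inspect
Saxon itself.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

public class WhichParser {
    // Name of the SAX parser factory class that JAXP resolves in this JVM.
    static String saxFactoryName() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    // Name of the DOM builder factory class that JAXP resolves in this JVM.
    static String domFactoryName() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("SAX: " + saxFactoryName());
        System.out.println("DOM: " + domFactoryName());
    }
}
```

With Xerces-J selected, the names should begin with org.apache.xerces.jaxp;
the parser bundled with the JDK reports names beginning with
com.sun.org.apache.xerces.internal.jaxp instead.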

Many thanks to all who responded to my question.

Ken

********************************
Kenneth R. Beesley, D.Phil.
PO Box 540475
North Salt Lake UT 84054
USA
