
Re: unreadable characters from indesign

Subject: Re: unreadable characters from indesign
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Wed, 17 Jan 2007 23:47:47 +0100
Marc Lambrichs wrote:
I'm reading in an xml-feed from Adobe InDesign and in some nodes there are three characters that can't be interpreted by my xsl-translation using utf-8. The codepoints of these 3 are (octal) 226, 128, 169. First of all, I would like to know what these characters should represent. And secondly, could I filter these characters out using something like translate?


This is not possible. If 226, 128 and 169 are supposed to be octal, you mistyped at least the digits '8' and '9', which do not exist in octal.
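A quick way to check which of the three values can be octal at all, sketched in Python (the helper name is_octal is made up for this illustration):

```python
def is_octal(digits: str) -> bool:
    """Return True if the string parses as a base-8 number."""
    try:
        int(digits, 8)
    except ValueError:
        # Raised when a digit outside 0-7 (such as 8 or 9) is present.
        return False
    return True


for value in ("226", "128", "169"):
    print(value, "valid octal" if is_octal(value) else "not octal")
```

Only 226 passes; 128 and 169 are rejected because of the digits 8 and 9.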


Assuming you meant decimal, and that you are indeed talking about codepoints, there cannot be any problem in reading them. The codepoints 226, 128 and 169 represent the string â&#128;© (the mailer may garble this), which consists of:

U+00E2, LATIN SMALL LETTER A WITH CIRCUMFLEX
U+0080, control
U+00A9, COPYRIGHT SIGN

See http://www.unicode.org/Public/UNIDATA/UnicodeData.txt for a full list of codepoints.

In UTF-8, these are encoded as the following octets (view your input in hexadecimal to check whether it matches):
U+00E2 >>> C3A2
U+0080 >>> C280
U+00A9 >>> C2A9
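
The octets listed above can be reproduced with a short script. This is only a sketch in Python, not something from the original thread, and the helper name utf8_hex is invented for the illustration. The second half rests on an assumption about the poster's data: that 226, 128 and 169 are raw byte values in the file rather than codepoints. Under that assumption the three bytes E2 80 A9 form one valid UTF-8 sequence for a single character.

```python
import unicodedata


def utf8_hex(codepoint: int) -> str:
    """Return the UTF-8 encoding of one codepoint as uppercase hex."""
    return chr(codepoint).encode("utf-8").hex().upper()


# Reading 1: 226, 128 and 169 are three separate codepoints.
for cp in (226, 128, 169):
    print(f"U+{cp:04X} >>> {utf8_hex(cp)}")

# Reading 2 (assumption): they are three raw bytes from the input file.
raw = bytes([226, 128, 169])
decoded = raw.decode("utf-8")  # a single character if the bytes are valid UTF-8
print(f"bytes {raw.hex().upper()} >>> U+{ord(decoded):04X}",
      unicodedata.name(decoded))
```

Under the second reading the result is U+2029, PARAGRAPH SEPARATOR, which a layout program could plausibly emit. Whether that is what the feed actually contains can only be settled by looking at the input in hexadecimal.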


I am not sure what you mean by "can't be interpreted by my xsl-translation using utf-8", because any conforming XSLT processor understands at least UTF-8 and UTF-16. However, if you mean that these characters are present and should be removed, you can indeed use translate() to remove them:

translate($yourinput, '&#226;&#128;&#169;', '')
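
For readers who want to try the same idea outside a stylesheet, here is a rough Python analogue of that translate() call. It is a sketch only; strip_chars and the sample string are made up for the illustration. Like translate() with an empty third argument, it deletes every occurrence of each listed character individually, not just the three characters when they appear together as a sequence.

```python
def strip_chars(text: str, unwanted: str) -> str:
    """Delete every character of `unwanted` from `text`."""
    # The third argument of str.maketrans lists characters to delete.
    table = str.maketrans("", "", unwanted)
    return text.translate(table)


# The three codepoints discussed above: U+00E2, U+0080, U+00A9.
UNWANTED = "\u00e2\u0080\u00a9"

sample = "first line\u00e2\u0080\u00a9second line"  # hypothetical input
print(strip_chars(sample, UNWANTED))
```

Note that this also removes a legitimate â or © elsewhere in the text, which is equally true of the translate() expression.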

But if you mean that the input has these three values encoded in some way that is not UTF-8, then you will have to fix your input: data that contains illegal UTF-8 sequences cannot be processed as if it were UTF-8.
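
One way to find out which situation applies is to test the raw bytes of the input for illegal UTF-8 sequences before handing them to the XSLT processor. The following is a minimal sketch in Python; the function name first_invalid_utf8_offset is hypothetical.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the first illegal UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")  # strict decoding: raises on any illegal sequence
    except UnicodeDecodeError as err:
        return err.start
    return None


# E2 80 A9 is a legal three-byte sequence, so this input is valid UTF-8.
print(first_invalid_utf8_offset(b"ok \xe2\x80\xa9"))

# A lone A9 byte (for example from Latin-1 data) is not legal UTF-8.
print(first_invalid_utf8_offset(b"ab\xa9"))
```

If the check reports an offset, the file is not UTF-8 and has to be re-encoded or declared with its real encoding; removing characters in the stylesheet will not help in that case.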

Cheers,
-- Abel Braaksma
  http://www.nuntia.nl
