Subject: Re: Safe-guarding codepoints-to-string() from wrong input
From: "Andrew Welch" <andrew.j.welch@xxxxxxxxx>
Date: Wed, 20 Dec 2006 15:08:12 +0000
On 12/20/06, Abel Braaksma <abel.online@xxxxxxxxx> wrote:
> I know that control characters are not allowed and throw an "Invalid XML
> character" error.

If you are receiving strings containing literal control characters,
then they are almost certainly encoded in Windows-1252: just parse them
using that encoding and you'll be OK.
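
To illustrate the point above, here is a minimal sketch in Python (not part of the original message; the sample bytes are made up). It assumes the "control characters" in question are bytes in the 0x80-0x9F range, which ISO-8859-1 maps onto the C1 control codepoints while Windows-1252 maps most of them onto printable characters such as curly quotes and dashes.

```python
# Hypothetical sample input: 0x93 and 0x94 are curly double quotes and
# 0x96 is an en dash in Windows-1252.
raw = b"\x93safe\x94 \x96 ok"

# Decoding as ISO-8859-1 maps 0x80-0x9F straight onto the C1 control
# codepoints U+0080-U+009F, which is where the stray controls come from.
as_latin1 = raw.decode("iso-8859-1")

# Decoding as Windows-1252 yields the intended printable characters.
as_cp1252 = raw.decode("cp1252")

print([hex(ord(c)) for c in as_latin1])
print(as_cp1252)
```

Note that Python's cp1252 codec rejects the five bytes Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D), so real input may need an `errors=` argument on `decode`.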

If the string contains control characters as character references,
then it's a bit harder, because the references get expanded to
Unicode codepoints and not to the characters specified in the
Windows-1252 mappings... So you need to parse/serialize the string to
expand the references (I personally use JTidy with the CharEncoding set
to Configuration.RAW, which forces Tidy to output the bytes instead of
a reference).
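
As a rough alternative to the JTidy round trip described above (this is a different technique, not what the message uses), the references can be rewritten before the XML parser ever sees them. The sketch below is in Python; the function name `fix_c1_references` is hypothetical, and it assumes that any numeric reference in the range 128-159 was meant as a Windows-1252 byte value.

```python
import re

# Matches decimal (&#147;) and hexadecimal (&#x93;) character references.
_NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

def fix_c1_references(text):
    """Replace references to 128-159 with their Windows-1252 characters."""
    def repl(match):
        hex_digits, dec_digits = match.group(1), match.group(2)
        cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if 0x80 <= cp <= 0x9F:
            try:
                # Reinterpret the number as a Windows-1252 byte.
                return bytes([cp]).decode("cp1252")
            except UnicodeDecodeError:
                # Undefined in Windows-1252 (e.g. 0x81): leave untouched.
                return match.group(0)
        # Every other reference is left for the XML parser to expand.
        return match.group(0)
    return _NCR.sub(repl, text)

print(fix_c1_references("&#147;quoted&#148; &#x96; &#169;"))
```

A regular expression over raw markup does not know about CDATA sections or comments, so treat this as a pre-processing sketch for simple input only.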

It's a pain....

cheers
andrew
