Re: 0x19 is not a legal XML character

Subject: Re: 0x19 is not a legal XML character
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Thu, 28 Jun 2007 12:46:10 +0200
Andrew Welch wrote:
On 6/28/07, Abel Braaksma <abel.online@xxxxxxxxx> wrote:

This may work and will remove all offending U+0019 chars.

The "offending" U+0019 characters could well be good content that's being written/read in the wrong encoding.

True, but if I remember correctly, all ISO 646 characters (the original ASCII range, below 0x80) are encoded unchanged in UTF-8, all ISO-8859-x, the CPxxx Windows/DOS encodings, TIS-620, Shift-JIS, GB2312, etc. The only notable exceptions are, I believe, the IBM EBCDIC encodings (though IBM500, the one most often used, also has End Of Medium at 0x19). None of these encodings, not even the EBCDIC ones, uses 0x19 for a diacritic.

My point is just this: I think it is very unlikely that encoding alone (on read or on write) is the culprit here, although it often is the culprit for characters above 0x7F.
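
For anyone who wants to check the claim above rather than trust my memory, here is a minimal Python sketch. The codec names are Python's own spellings, and picking cp1252 and cp437 to stand in for the "CPxxx" families is my assumption, not an exhaustive list.

```python
# Decode the single byte 0x19 under each encoding and report the code point.
# ENCODINGS is an illustrative selection (an assumption), not a complete list.
ENCODINGS = [
    "ascii", "utf_8", "latin_1", "iso8859_15",
    "cp1252", "cp437", "tis_620", "shift_jis", "gb2312",
    "cp500",  # IBM500, an EBCDIC code page
]

def decode_0x19(encodings=ENCODINGS):
    """Return {encoding name: code point that byte 0x19 decodes to}."""
    return {name: ord(b"\x19".decode(name)) for name in encodings}

if __name__ == "__main__":
    for name, code_point in decode_0x19().items():
        print(f"{name:12} 0x19 -> U+{code_point:04X}")
```

If every row reports U+0019, a plain mix-up between these encodings cannot have produced the character from something else.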

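Of course, it can be valid content, in which case the XML documents should be declared and processed as XML 1.1 documents. Note that even XML 1.1 permits U+0019 only as a character reference (&#x19;), not as a literal character.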
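
To see the XML 1.0 side of this, a small sketch using Python's standard library parser, which is built on expat and implements XML 1.0 only (so it cannot demonstrate the XML 1.1 behaviour). The helper name `is_well_formed` is made up for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the XML 1.0 parser accepts text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

if __name__ == "__main__":
    samples = {
        "plain text": "<a>ok</a>",
        "literal U+0019": "<a>\x19</a>",
        "character reference": "<a>&#x19;</a>",
    }
    for label, doc in samples.items():
        print(f"{label:20} well-formed: {is_well_formed(doc)}")
```

Under XML 1.0 both the literal character and the character reference are rejected, which is why switching to XML 1.1 only helps if the character is written as a reference.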


Simply stripping them out probably isn't the best approach - you need to work out why they're there, what put them there and then fix that. Patching it up afterwards is never a good idea.

Agreed. I just wanted to show how it can be done in XSLT, in case you (the OP) felt a need for it.
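
The XSLT itself is in the earlier message. Purely as an illustration of the same patch-up idea outside XSLT, here is a hedged Python sketch that strips everything the XML 1.0 Char production disallows before the text reaches a parser. The function name `strip_illegal_xml10` is hypothetical, and this is a stopgap, not a replacement for fixing the source.

```python
import re

# Characters allowed by the XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The pattern matches anything outside those ranges.
_ILLEGAL_XML10 = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def strip_illegal_xml10(text):
    """Return text with all characters that XML 1.0 forbids removed."""
    return _ILLEGAL_XML10.sub("", text)

if __name__ == "__main__":
    dirty = "before\x19after\tkept"
    print(repr(strip_illegal_xml10(dirty)))
```

Tab, line feed and carriage return survive; U+0019 and the other C0 controls do not.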



Imagine explaining your process to someone else in a year's time - "this step is where we remove the U+0019 characters".

:D :D
Good design starts at the sources.

cheers,
Abel
