Subject: Compressed content in CDATA
Author: Brightman Mkhwanazi
Date: 11 Nov 2009 12:12 AM
I have an XML document that has a CDATA section with compressed content. I get this error when opening the file:

Invalid character (Unicode: 0x1F)

I have attached the xml document.

I think the issue is with character encoding for compressed content.


Attachment: fromMQ-20091104-160249-780.xml (MQ Series XML document)

Subject: Compressed content in CDATA
Author: Tony Lavinio
Date: 11 Nov 2009 07:57 AM
That's not valid XML.

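If the file had an XML header that stated version 1.1, then you
could hold the characters with Unicode values < 32 (except null),
but only as numeric character references such as &#x1F;, and those
are not recognized inside a CDATA section. And relatively few
parsers handle XML 1.1; that standard was a non-starter.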

But CDATA does not mean "can hold binary content". It's just a
wrapper that lets you avoid having to escape certain characters
such as < and &. The "C" means "character".
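
To illustrate the point about CDATA being only an escaping convenience, here is a minimal sketch using Python's standard xml.etree.ElementTree. The element name "msg" and the sample text are made up for the example. Both documents should give the parser the same character data.

```python
import xml.etree.ElementTree as ET

# Same text written two ways: wrapped in CDATA, and with entity escapes.
# The <msg> element and its content are hypothetical sample data.
cdata_doc = "<msg><![CDATA[a < b && c]]></msg>"
escaped_doc = "<msg>a &lt; b &amp;&amp; c</msg>"

cdata_text = ET.fromstring(cdata_doc).text
escaped_text = ET.fromstring(escaped_doc).text

# The parser hands back identical character data in both cases.
print(cdata_text == escaped_text)
```

Either way the content is still characters, so the usual XML character rules apply inside the CDATA section too.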

Typically when one needs to write binary content, it is exported as
base-64, as in the examples at http://www.stylusstudio.com/binary_xml.html
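
As a rough sketch of the base-64 approach (not the method from the linked page), the following Python compresses a payload with zlib, encodes it as base-64 text, and places it in an element. The element names "message" and "body", the "encoding" attribute, and the helper names are assumptions made for this example only.

```python
import base64
import zlib
import xml.etree.ElementTree as ET


def wrap_compressed(payload: bytes) -> str:
    """Compress payload and embed it in XML as base-64 text."""
    compressed = zlib.compress(payload)
    encoded = base64.b64encode(compressed).decode("ascii")
    root = ET.Element("message")  # hypothetical element name
    body = ET.SubElement(root, "body", encoding="base64")  # hypothetical
    body.text = encoded
    return ET.tostring(root, encoding="unicode")


def unwrap_compressed(xml_text: str) -> bytes:
    """Reverse wrap_compressed: parse, base-64 decode, decompress."""
    body = ET.fromstring(xml_text).find("body")
    return zlib.decompress(base64.b64decode(body.text))


original = b"compressed content that would not survive as raw bytes" * 4
xml_text = wrap_compressed(original)
restored = unwrap_compressed(xml_text)
print(restored == original)
```

Base-64 output uses only letters, digits, "+", "/" and "=", so it is always legal XML character data, at the cost of roughly a third more space.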

Subject: Compressed content in CDATA
Author: Brightman Mkhwanazi
Date: 11 Nov 2009 08:51 AM
Hi Tony

I am getting the file as it is. Is there no character encoding for compressed binary data?

Regards
Brightman

Subject: Compressed content in CDATA
Author: Tony Lavinio
Date: 11 Nov 2009 05:21 PM
There is none. XML is a text format.

In fact, to be considered a conforming XML parser, it *must*
reject binary data.
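
A small sketch of that behaviour, assuming the compressed content is a gzip stream (the thread does not say which format was used; gzip streams happen to begin with the bytes 0x1F 0x8B, which would match the reported error). The element name "msg" and the helper name are hypothetical. Python's standard parser is used here and is expected to refuse the document.

```python
import gzip
import xml.etree.ElementTree as ET


def is_rejected(doc: bytes) -> bool:
    """Return True if the parser refuses the document as not well-formed."""
    try:
        ET.fromstring(doc)
    except ET.ParseError:
        return True
    return False


# Assumption: the payload is gzip data; its first byte is 0x1F.
compressed = gzip.compress(b"hello")

# Raw compressed bytes dropped straight into a CDATA section.
raw_doc = b"<msg><![CDATA[" + compressed + b"]]></msg>"

# Ordinary text in the same position, for comparison.
text_doc = b"<msg><![CDATA[hello]]></msg>"

print(hex(compressed[0]))
print(is_rejected(raw_doc))
print(is_rejected(text_doc))
```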

The definition of CDATA is here; notice that it refers to "Char"s:
http://www.w3.org/TR/2008/REC-xml-20081126/#sec-cdata-sect

Clicking on the "Char" link gets us to the list of valid Unicode values:
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */

The only valid values with a Unicode value >=0 and <=31 are TAB,
CR and LF.
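
The Char production above can be written out as a small check. This is only an illustrative sketch; the function name is made up, and the ranges are copied from production [2].

```python
def is_xml10_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 Char production [2]."""
    return (
        cp in (0x9, 0xA, 0xD)          # TAB, LF, CR
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# Which code points from 0 to 31 are allowed?
valid_controls = [cp for cp in range(0x20) if is_xml10_char(cp)]
print(valid_controls)        # [9, 10, 13]

# The character from the original error message.
print(is_xml10_char(0x1F))   # False
```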

 