
RE: How to specify a Processing Instruction? (better: how to control enc

  • From: "Arnold, Curt" <Curt.Arnold@h...>
  • To: "'xml-dev@l...'" <xml-dev@l...>
  • Date: Wed, 29 Aug 2001 12:06:24 -0600

> Well, this is not what I read.  I read that the encoding part 
> of the processing instruction is something that will be used 
> eventually by the server I'm sending it to.  

The XML declaration is not a Processing Instruction; it only resembles one.

> Right now, I'm 
> just creating the document from scratch so there really is 
> "no" encoding.  I want to set the encoding so that the 
> eventually server will be able to understand the XML 
> document....it expects ISO-8859-1.  This seems a bit like a 
> chicken and the egg....


XML processors are required by the spec to support UTF-8 and UTF-16; support for all other encodings (including ISO-8859-1) is optional.  If your "server" only recognizes ISO-8859-1, then it is
nonconformant with the XML 1.0 specification.  Forcing the encoding to be ISO-8859-1 will cause the serialization to fail if there are any characters with Unicode code points > 255, since those cannot
be represented by ISO-8859-1 (unless the serializer writes them out as numeric character references).
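
As a rough illustration of the code point limit, here is a small Python sketch (nothing in it is MSXML-specific, and the helper name fits_latin1 is made up for this example).  It checks whether a string can be written as ISO-8859-1 and shows the numeric character reference fallback that some serializers apply to text content:

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character can be encoded as ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


print(fits_latin1("caf\u00e9"))   # True: U+00E9 is code point 233
print(fits_latin1("\u20ac10"))    # False: U+20AC (euro sign) is above 255

# Fallback: replace unencodable characters with numeric character references.
# This only helps in text and attribute values, not in element names.
print("\u20ac10".encode("iso-8859-1", errors="xmlcharrefreplace"))  # b'&#8364;10'
```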

Setting the encoding (if possible) is done as the DOM tree is serialized (that is, written out as a string of bytes).
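
To make that concrete, here is a minimal sketch using Python's xml.dom.minidom rather than MSXML (the element names "order" and "item" are invented for the example).  The tree is built from scratch with no encoding at all; the encoding is chosen only at the moment the tree is written out as bytes:

```python
from xml.dom.minidom import getDOMImplementation

# Build a DOM tree from scratch. The tree holds Unicode characters,
# so there is no encoding to set at this stage.
impl = getDOMImplementation()
doc = impl.createDocument(None, "order", None)
item = doc.createElement("item")
item.appendChild(doc.createTextNode("caf\u00e9"))
doc.documentElement.appendChild(item)

# The encoding is picked when serializing; the same tree can be
# written out in more than one encoding.
latin1_bytes = doc.toxml(encoding="ISO-8859-1")
utf8_bytes = doc.toxml(encoding="UTF-8")

print(latin1_bytes)
print(utf8_bytes)
```

In this sketch the XML declaration in each output is generated by the serializer to match the bytes it writes; it is not a node that was added to the tree.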

> I basically want to start with an xml document that has the 
> basic "framework" and then I want to add and change some of 
> the nodes so that they have want the server expects and then 
> I want to send this off.  If I try to read the document in 
> completely blank as only: <?xml version="1.0" 
> encoding="ISO-8859-1"?> It still gives me an error saying it 
> can't read it in.  Of course, at that point, its not only not 
> encoded, there's nothing there at all.

That is not an XML document, since an XML document requires one and only one document element.

<?xml version="1.0" encoding="ISO-8859-1"?>
<foo/>

is an XML document.
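
The difference is easy to check with any parser.  A small Python sketch, assuming the standard library's expat-based minidom (is_well_formed is a helper name invented here):

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError


def is_well_formed(data: bytes) -> bool:
    """Return True if the bytes parse as a well-formed XML document."""
    try:
        parseString(data).unlink()
        return True
    except ExpatError:
        return False


declaration_only = b'<?xml version="1.0" encoding="ISO-8859-1"?>'
with_root = declaration_only + b"\n<foo/>"
two_roots = declaration_only + b"\n<foo/><bar/>"

print(is_well_formed(declaration_only))  # False: no document element
print(is_well_formed(with_root))         # True
print(is_well_formed(two_roots))         # False: more than one document element
```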

> 
> This whole issue of how encoding in XML works is the weakest 
> thing.....I went to Borders books yesterday and I looked at 
> every single book on the shelf - a total of more than 
> 40-50....it took me hours and there wasn't more than a page 
> on encoding in any one book.  
> And, even then, it simply 
> explained what encoding was and nothing practical.  How do 
> you create a document from scratch with a particular 
> encoding?  How do you change the encoding of an existing 
> document?  I can't really find any decent documentation on 
> this anywhere.  To top it all off its the heart and soul of 
> how XML works....without it, XML is nothing but a dream.  I 
> could be able to read in an XML document and then there 
> should be a basic MSXML method that will allow me to convert 
> the document or a single node from one encoding style to 
> another....

Again, encoding is a property of an XML document when it is written out as a stream of bytes; it has no meaning while the document is in a DOM tree.  MSXML's unusual use of a ProcessingInstruction node
to represent the XML declaration only describes the former state of the document: at one time it was encoded using whatever encoding the declaration names.
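
Seen this way, "changing the encoding" of an existing document is just parsing the bytes into a tree and serializing the tree with a different encoding.  A hedged sketch in Python's minidom (not MSXML; the function name transcode and the sample document are made up for illustration):

```python
from xml.dom.minidom import parseString


def transcode(xml_bytes: bytes, target_encoding: str) -> bytes:
    """Parse XML bytes, then re-serialize the tree in another encoding."""
    doc = parseString(xml_bytes)  # parser reads the source encoding from the declaration
    try:
        return doc.toxml(encoding=target_encoding)
    finally:
        doc.unlink()


source = b'<?xml version="1.0" encoding="ISO-8859-1"?><name>Ren\xe9</name>'
converted = transcode(source, "UTF-8")
print(converted)
```

The declaration in the result names UTF-8 because the serializer wrote it; the ISO-8859-1 declaration of the input does not carry over into the output.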

> it doesn't exist....how can that not exist?  Is 
> there some other method in VB that lets me change from 
> encoding to another?
> 
> I'm sure the problem is that I really don't understand 
> something very basic but I admit it --- it has to be that 
> way, otherwise, there are some major fundamental operations 
> that seem to be critically missing on how to manipulate XML 
> documents for the different encoding styles of the world.

The XML recommendation addressed this by basing XML on Unicode and stating that the only required encodings are UTF-8 and UTF-16.  Use of any other encoding is allowed but support for it is not
required, so if you want your documents to be universally readable, you will encode them in either UTF-8 or UTF-16.

The DOM 3 effort is formulating a specification for DOM saving (http://www.w3.org/TR/DOM-Level-3-ASLS/).  In its current draft, it does allow you to specify the encoding to be used, but as a property
of the OutputStream, not as anything in the document.

MSXML appears to attempt to preserve the encoding used on load, but will use UTF-8 if the tree was built from scratch.

