
Fw: Encodings and how they're specified

  • From: Hermann Stamm-Wilbrandt <STAMMW@de.ibm.com>
  • To: xml-dev@lists.xml.org
  • Date: Tue, 5 Jul 2011 18:33:09 +0200

Forgot to add xml-dev ...

----- Forwarded by Hermann Stamm-Wilbrandt/Germany/IBM on 07/05/2011 06:32 PM -----

From:   Hermann Stamm-Wilbrandt/Germany/IBM
To:     David Carlisle <davidc@nag.co.uk>
Date:   07/05/2011 06:03 PM
Subject:        Re:  Encodings and how they're specified


> b) in the absence of external http header information, using the bom and
> or the first few bytes encoding "<?xml" it can figure out how (most
> likely) the ascii range of characters are encoded and that's good enough
> to be able to read the encoding declaration and fix up the encoding once
> you have read that.

I would agree "normally", but EBCDIC encoding is different.

The commands below show how I create an "ebcdic-de" encoded XML file
(uconv is like iconv, but part of the ICU library distribution;
ebcdic.xml.txt is not XML, because its encoding declaration and its
actual encoding differ).

So an XML processor/parser should be able to deal with ebcdic.xml and
correctly determine its "ebcdic-de" encoding, right?

$ cat ebcdic.xml.txt 
<?xml version="1.0" encoding="ebcdic-de"?>
<ebcdic>123</ebcdic>
$ 
$ uconv -f utf-8 -t ebcdic-de ebcdic.xml.txt >ebcdic.xml
$ 
$ od -Ax -tx1 ebcdic.xml
000000 4c 6f a7 94 93 40 a5 85 99 a2 89 96 95 7e 7f f1
000010 4b f0 7f 40 85 95 83 96 84 89 95 87 7e 7f 85 82
000020 83 84 89 83 60 84 85 7f 6f 6e 25 4c 85 82 83 84
000030 89 83 6e f1 f2 f3 4c 61 85 82 83 84 89 83 6e 25
000040
$ 
$ cat ebcdic.xml; echo
Lo���@�������~�K�@��������~������`��on%L������n���La������n%
$ 
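For readers without uconv at hand, the dump above can be checked with a small Python sketch. It assumes that ICU's "ebcdic-de" corresponds to IBM code page 273, which Python exposes as the "cp273" codec; that mapping and the helper name od_to_bytes are assumptions made for illustration, not part of the original post.

```python
# Hex dump copied from the `od -Ax -tx1 ebcdic.xml` output above.
OD_DUMP = """\
000000 4c 6f a7 94 93 40 a5 85 99 a2 89 96 95 7e 7f f1
000010 4b f0 7f 40 85 95 83 96 84 89 95 87 7e 7f 85 82
000020 83 84 89 83 60 84 85 7f 6f 6e 25 4c 85 82 83 84
000030 89 83 6e f1 f2 f3 4c 61 85 82 83 84 89 83 6e 25
000040
"""


def od_to_bytes(dump: str) -> bytes:
    """Drop the offset column of each od line and join the hex bytes."""
    out = bytearray()
    for line in dump.splitlines():
        fields = line.split()
        # fields[0] is the offset; the remaining fields are byte values.
        out.extend(int(h, 16) for h in fields[1:])
    return bytes(out)


raw = od_to_bytes(OD_DUMP)

# Assumption: "ebcdic-de" is IBM code page 273 ("cp273" in Python).
text = raw.decode("cp273")
print(text)
```

All characters in this file belong to the part of EBCDIC that the common Latin code pages share, so decoding with "cp037" instead should give the same text for this particular sample.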


Mit besten Gruessen / Best wishes,

Hermann Stamm-Wilbrandt
Developer, XML Compiler, L3
Fixpack team lead
WebSphere DataPower SOA Appliances
https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294 



From:   David Carlisle <davidc@nag.co.uk>
To:     Joe Fawcett <joefawcett@hotmail.com>
Cc:     xml-dev@lists.xml.org
Date:   07/05/2011 04:40 PM
Subject:        Re:  Encodings and how they're specified



On 05/07/2011 15:22, Joe Fawcett wrote:
> , I still don't see how it manages to read the encoding mentioned in
> the XML declaration with only the BOM available?

you are verging off list, but

a) it's specified here
http://www.w3.org/TR/2008/REC-xml-20081126/#sec-guessing

and

b) in the absence of external HTTP header information, using the BOM
and/or the first few bytes encoding "<?xml", it can figure out how (most
likely) the ASCII range of characters is encoded, and that's good enough
to be able to read the encoding declaration and fix up the encoding once
you have read that.
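
The two-step approach described in (b) can be sketched in Python as follows. This is only an illustration under stated assumptions: the byte patterns follow the table in the section of the XML 1.0 Recommendation linked above (the UCS-4 variants are left out), the names SIGNATURES, sniff and declared_encoding are made up for this sketch, and "cp037" is used as a stand-in code page for reading the declaration of any EBCDIC flavour.

```python
import re

# (leading bytes, provisional codec) pairs, a subset of the spec's table.
# The provisional codec only has to get the ASCII-range characters right.
SIGNATURES = [
    (b"\xef\xbb\xbf", "utf-8-sig"),      # UTF-8 with BOM
    (b"\xfe\xff", "utf-16"),             # UTF-16 BOM, big-endian
    (b"\xff\xfe", "utf-16"),             # UTF-16 BOM, little-endian
    (b"\x00\x3c\x00\x3f", "utf-16-be"),  # "<?" in UTF-16BE, no BOM
    (b"\x3c\x00\x3f\x00", "utf-16-le"),  # "<?" in UTF-16LE, no BOM
    (b"\x3c\x3f\x78\x6d", "latin-1"),    # "<?xm" in an ASCII-compatible encoding
    (b"\x4c\x6f\xa7\x94", "cp037"),      # "<?xm" in some flavour of EBCDIC
]

DECL = re.compile(r"""<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._\-]*)["']""")


def sniff(data: bytes) -> str:
    """Step 1: guess a provisional codec from the first few bytes."""
    for signature, codec in SIGNATURES:
        if data.startswith(signature):
            return codec
    return "utf-8"  # no BOM and no recognisable "<?xml": assume UTF-8


def declared_encoding(data: bytes, limit: int = 200):
    """Step 2: read the declaration with the provisional codec.

    Returns (provisional codec, declared encoding name or None).
    """
    provisional = sniff(data)
    head = data[:limit].decode(provisional, errors="replace")
    match = DECL.search(head)
    return provisional, (match.group(1) if match else None)


# The first 42 bytes of ebcdic.xml as shown in the hex dump earlier in the thread.
sample = bytes.fromhex(
    "4c 6f a7 94 93 40 a5 85 99 a2 89 96 95 7e 7f f1"
    "4b f0 7f 40 85 95 83 96 84 89 95 87 7e 7f 85 82"
    "83 84 89 83 60 84 85 7f 6f 6e"
)
print(declared_encoding(sample))
```

A real parser would then restart decoding with the declared encoding (here "ebcdic-de"), provided it supports that encoding at all.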

David


