Fw: Encodings and how they're specified
Forgot to add xml-dev ...

----- Forwarded by Hermann Stamm-Wilbrandt/Germany/IBM on 07/05/2011 06:32 PM -----

From: Hermann Stamm-Wilbrandt/Germany/IBM
To: David Carlisle <davidc@nag.co.uk>
Date: 07/05/2011 06:03 PM
Subject: Re: Encodings and how they're specified

> b) in the absence of external HTTP header information, using the BOM
> and/or the first few bytes encoding "<?xml" it can figure out how (most
> likely) the ASCII range of characters is encoded, and that's good enough
> to be able to read the encoding declaration and fix up the encoding once
> you have read that.

I would agree "normally", but EBCDIC encoding is different.

This is how I create an "ebcdic-de" encoded XML file. (uconv is like iconv, but part of the ICU library distribution. ebcdic.xml.txt is not XML, because its encoding declaration and its actual encoding differ.)

So an XML processor/parser should be able to deal with ebcdic.xml and correctly determine its "ebcdic-de" encoding, right?

$ cat ebcdic.xml.txt
<?xml version="1.0" encoding="ebcdic-de"?>
<ebcdic>123</ebcdic>
$
$ uconv -f utf-8 -t ebcdic-de ebcdic.xml.txt >ebcdic.xml
$
$ od -Ax -tx1 ebcdic.xml
000000 4c 6f a7 94 93 40 a5 85 99 a2 89 96 95 7e 7f f1
000010 4b f0 7f 40 85 95 83 96 84 89 95 87 7e 7f 85 82
000020 83 84 89 83 60 84 85 7f 6f 6e 25 4c 85 82 83 84
000030 89 83 6e f1 f2 f3 4c 61 85 82 83 84 89 83 6e 25
000040
$
$ cat ebcdic.xml; echo
Lo���@�������~�K�@��������~������`��on%L������n���La������n%
$

Best wishes,

Hermann Stamm-Wilbrandt
Developer, XML Compiler, L3 Fixpack team lead
WebSphere DataPower SOA Appliances
https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martin Jetter
Management: Dirk Wittkopp
Registered office: Boeblingen
Court of registration: Amtsgericht Stuttgart, HRB 243294

From: David Carlisle <davidc@nag.co.uk>
To: Joe Fawcett <joefawcett@hotmail.com>
Cc: xml-dev@lists.xml.org
Date: 07/05/2011 04:40 PM
Subject: Re: Encodings and how they're specified

On 05/07/2011 15:22, Joe Fawcett wrote:
> , I still don't see how it manages to read the encoding mentioned in
> the XML declaration with only the BOM available?

You are verging off list, but

a) it's specified here: http://www.w3.org/TR/2008/REC-xml-20081126/#sec-guessing

and

b) in the absence of external HTTP header information, using the BOM and/or the first few bytes encoding "<?xml" it can figure out how (most likely) the ASCII range of characters is encoded, and that's good enough to be able to read the encoding declaration and fix up the encoding once you have read that.

David

________________________________________________________________________
The Numerical Algorithms Group Ltd is a company registered in England and Wales with company number 1249803. The registered office is: Wilkinson House, Jordan Hill Road, Oxford OX2 8DR, United Kingdom.

This e-mail has been scanned for all viruses by Star. The service is powered by MessageLabs.
________________________________________________________________________

_______________________________________________________________________
XML-DEV is a publicly archived, unmoderated list hosted by OASIS to support XML implementation and development. To minimize spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
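
For readers of this thread, here is a rough Python sketch of the two-step detection discussed above: use the first four bytes to guess the encoding family, then read the encoding declaration with a provisional decoder from that family. It is only an illustration, not how any particular parser does it. The names SIGNATURES, ALIASES, SAMPLE and sniff_encoding are made up for this example, only the two no-BOM signatures relevant here are handled, and the mapping of "ebcdic-de" to Python's cp273 codec is an assumption.

```python
import re

# First four bytes of "<?xm" for two no-BOM cases from the XML 1.0 appendix
# on autodetection. The codec on the right is only a provisional decoder for
# reading the declaration, not the final answer.
SIGNATURES = {
    bytes.fromhex("3c3f786d"): "ascii",  # UTF-8, ISO 8859 and other ASCII-compatible encodings
    bytes.fromhex("4c6fa794"): "cp037",  # some flavor of EBCDIC
}

# Pulls the encoding name out of a decoded XML declaration.
DECL = re.compile(r'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']')

# Assumption: the ICU name "ebcdic-de" corresponds to Python's cp273 codec.
ALIASES = {"ebcdic-de": "cp273"}

# The 64 bytes from the od dump of ebcdic.xml in the message above.
SAMPLE = bytes.fromhex(
    "4c6fa7949340a58599a28996957e7ff1"
    "4bf07f4085958396848995877e7f8582"
    "8384898360848 57f6f6e254c85828384".replace(" ", "")
    + "89836ef1f2f34c618582838489836e25"
)


def sniff_encoding(data: bytes):
    """Return (provisional codec, declared encoding name), or (None, None)."""
    family = SIGNATURES.get(data[:4])
    if family is None:
        return None, None
    # The characters used in an XML declaration sit at the same code points
    # across the codecs of one family, so a provisional decode is enough here.
    head = data[:200].decode(family, errors="replace")
    match = DECL.search(head)
    return family, (match.group(1) if match else None)


if __name__ == "__main__":
    family, declared = sniff_encoding(SAMPLE)
    print(family, declared)
    if declared is not None:
        print(SAMPLE.decode(ALIASES.get(declared.lower(), declared)))
```

The point of the sketch is that the bytes 4c 6f a7 94 already identify the file as EBCDIC of some kind, which is enough to read the declaration and learn the exact code page; whether a given parser then supports that code page is a separate question.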