Character encoding questions
I was struck by the following sentence in the Microsoft XML White Paper:

    XML supports a range of encodings...subject only to the restriction that an entire document must share the same encoding.

My immediate reaction was that that wasn't correct, although the definition of "document" above isn't obvious to me (for example, are external entities part of a document?). However, when checking the April XML specification, I got in over my head. I am hoping that someone here will help me out of my hole.

If my XML document is a simple Unicode text file, do I begin it like the following:

    a Byte Order Mark
    <?XML version="1.0" encoding="ISO-10646-UCS-2"?>
    ...

with the Byte Order Mark being required even though an EncodingDecl is used? (I would have said "yes" until I got to Appendix E, "Autodetection of Character Sets," which worries about detecting UCS-2 when there is no Byte Order Mark.) Is the EncodingDecl necessary if the file starts with a Byte Order Mark?

Where can I have an EncodingPI? Section 4.3.3 talks about there being one "at the beginning of a system entity, before any other character data or markup" but doesn't define "system entity" (perhaps one that has an ExternalID that contains "SYSTEM"?).

If my document references an external entity, then I believe that the external entity must start with an EncodingPI (see Appendix E, "Autodetection of Character Sets") if it isn't in UTF-8 or start with a Byte Order Mark. If I wanted to take the external entity and, for portability reasons, bundle it into my XML document as an internal entity, what do I do with the external entity's EncodingPI? It doesn't seem to be allowed in the internal entity declaration, in something like:

    <!ENTITY Pub-Status <?XML encoding="ISO-10646-UCS-2"?>"text here">

I presume that the answer is that I cannot convert an external entity into an internal one unless the external entity and my XML document have the same encoding.

What is the motivation for not allowing a change of encoding within an entity?
The mechanism for handling that seems no different from the one needed to handle different encodings in external entities, which I think of as being logically a part of the referencing document.

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)
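
The autodetection question raised above (telling UCS-2 apart with or without a Byte Order Mark, before any EncodingDecl can be read) can be illustrated with a minimal Python sketch. It is not taken from the specification: the function name `sniff_encoding` and the returned labels are hypothetical, and the byte patterns are an assumption about the kind of check the autodetection appendix has in mind, namely looking for a Byte Order Mark first and otherwise for the bytes of a leading `<?`.

```python
def sniff_encoding(head: bytes) -> str:
    """Guess an entity's encoding family from its first bytes.

    Hypothetical helper for illustration only; the patterns below are
    assumptions, not text copied from the XML specification.
    """
    # A Byte Order Mark settles both the encoding family and the byte order.
    if head.startswith(b"\xfe\xff"):
        return "UCS-2 big-endian (BOM)"
    if head.startswith(b"\xff\xfe"):
        return "UCS-2 little-endian (BOM)"
    # No BOM: a leading "<?" stored as 16-bit units still gives the order away.
    if head.startswith(b"\x00<\x00?"):
        return "UCS-2 big-endian (no BOM)"
    if head.startswith(b"<\x00?\x00"):
        return "UCS-2 little-endian (no BOM)"
    # One byte per character: the declaration can be read as ASCII
    # to find the actual encoding name.
    if head.startswith(b"<?"):
        return "ASCII-compatible (read the encoding declaration)"
    # Nothing recognisable: fall back to the default.
    return "UTF-8 (assumed default)"


decl = '<?XML version="1.0" encoding="ISO-10646-UCS-2"?>'
print(sniff_encoding(b"\xfe\xff" + decl.encode("utf-16-be")))
print(sniff_encoding(decl.encode("utf-16-le")))
print(sniff_encoding(decl.encode("ascii")))
```

Under these assumptions, a Byte Order Mark and an EncodingDecl carry overlapping information for UCS-2: either one is enough for this sketch to pick a decoder, which is one way to read the questions above about whether both are required.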