XML-DEV Mailing List Archive
Dangers of Copying Text into an XML Document
Hi Folks,

I am compiling a list of well-formedness problems that may arise from copying text from one document and pasting it into an XML document. For example, consider this XML document:

    <?xml version="1.0" encoding="UTF-8"?>
    <Document>
        <Para id="...">...</Para>
    </Document>

Suppose that text is copied from a document and pasted into the XML document, either as the content of the <Para> element or as the value of the id attribute.

Here is my current list of problems:

1. The text may contain these reserved characters: {<, >, ', ", &}. These characters may introduce syntax errors into the XML document and may need to be escaped. In element content, < and & must always be escaped; in an attribute value, so must whichever quote character delimits the value.

2. The editor that was used to create the text may use a different encoding than the XML document's encoding. A byte sequence that represents a character in one encoding may represent a different character, or no valid character at all, in another encoding. Consequently, if the text was created in an editor that uses a different encoding than the XML document, then the characters that result from pasting the text into the XML document may not be the same.

   Example: Suppose the text comes from Word and is encoded as Windows-1252. In Windows-1252 the left curly (a.k.a. smart) quote is the single byte x93. In Unicode the left curly quote is code point U+201C, which UTF-8 encodes as the three bytes xE2 x80 x9C. A lone x93 byte is not a valid UTF-8 sequence, and the code point U+0093 is a control character. Copying a left curly quote from a Word document and placing its bytes, unconverted, into a UTF-8 XML document may therefore leave the XML document with an invalid byte sequence or a control character rather than a left curly quote.

Can you think of other problems that may result from copying text from one document and pasting it into an XML document?

/Roger
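Both problems in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a complete sanitizer: the helper names `build_document` and `transcode` are hypothetical, the standard-library functions `escape` and `quoteattr` handle the reserved characters, and the source encoding is assumed to be known to be Windows-1252 (`cp1252` in Python).

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_document(para_text: str, id_value: str) -> bytes:
    """Build the example document with the pasted text escaped (hypothetical helper)."""
    content = escape(para_text)   # replaces &, <, > with entity references
    attr = quoteattr(id_value)    # escapes the value and wraps it in suitable quotes
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f"<Document><Para id={attr}>{content}</Para></Document>"
    )
    return xml.encode("utf-8")


def transcode(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes using their real encoding, then re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    # Problem 1: reserved characters in the pasted text.
    pasted = 'Tom & Jerry say "1 < 2"'
    doc = build_document(pasted, 'it\'s "id"')
    para = ET.fromstring(doc).find("Para")
    print(para.text == pasted)  # the parser recovers the original text

    # Problem 2: the same byte means different things in different encodings.
    left_quote_cp1252 = b"\x93"
    print(transcode(left_quote_cp1252, "cp1252"))  # UTF-8 bytes for U+201C
    try:
        left_quote_cp1252.decode("utf-8")
    except UnicodeDecodeError as err:
        print("not valid UTF-8:", err.reason)
```

The point of the design is that escaping and transcoding are separate steps: the bytes are first decoded with the encoding they were actually written in, and only then is the resulting text escaped and serialized in the XML document's encoding.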