output to iso-8859-1 of non-iso characters, what is r
Hi,

An XSLT 1.0 question. Just wanted to run something by you all: if I specify that my output encoding is iso-8859-1 and I output a character that iso-8859-1 cannot represent, for example by copying the value of a text node from a UTF-8 document, what is the processor required to do?

1. Fail with an error.
2. Implementation-specific: the processor can decide to fail, discard the non-iso characters from the output, or provide a setting so the user can choose at processing time.
3. Remove the non-iso characters from the output and do not fail.

And if it is 2, should the default be 1 or 3?

I would personally go with 3, but according to the spec, for the text output method:

"The text output method outputs the result tree by outputting the string-value of every text node in the result tree in document order without any escaping. The media-type attribute is applicable for the text output method. The default value for the media-type attribute is text/plain. The encoding attribute identifies the encoding that the text output method should use to convert sequences of characters to sequences of bytes. The default is system-dependent. If the result tree contains a character that cannot be represented in the encoding that the XSLT processor is using for output, the XSLT processor should signal an error."

BUT for the xml output method:

"The encoding attribute specifies the preferred encoding to use for outputting the result tree. XSLT processors are required to respect values of UTF-8 and UTF-16. For other values, if the XSLT processor does not support the specified encoding it may signal an error; if it does not signal an error it should use UTF-8 or UTF-16 instead. The XSLT processor must not use an encoding whose name does not match the EncName production of the XML Recommendation [XML]. If no encoding attribute is specified, then the XSLT processor should use either UTF-8 or UTF-16. It is possible that the result tree will contain a character that cannot be represented in the encoding that the XSLT processor is using for output. In this case, if the character occurs in a context where XML recognizes character references (i.e. in the value of an attribute node or text node), then the character should be output as a character reference; otherwise (for example if the character occurs in the name of an element) the XSLT processor should signal an error."

I take this to mean that if I am outputting an XML document with iso-8859-1 encoding, and a text node in my UTF-8 source contains a character that iso-8859-1 cannot represent, and I use the value of that text node as the value of a text node in the output, then the character should automatically be written as a character reference. But if I am outputting a text document with iso-8859-1 encoding, then the presence of such characters in the output should raise an error.

Cheers,
Bryan Rasmussen
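As a rough illustration of the reading above, here is a small Python sketch (not XSLT, and not a claim about what any particular processor does). It only mimics the two behaviours the spec text describes, using Python's built-in codec error handlers. The function names `serialize_text` and `serialize_xml_text_node` are made up for this example.

```python
def serialize_text(value: str, encoding: str = "iso-8859-1") -> bytes:
    """Mimic the text output method as read above: unrepresentable character -> error."""
    # errors="strict" raises UnicodeEncodeError for any character outside the encoding.
    return value.encode(encoding, errors="strict")


def serialize_xml_text_node(value: str, encoding: str = "iso-8859-1") -> bytes:
    """Mimic the xml output method as read above: fall back to a character reference."""
    # Minimal escaping of markup-significant characters in a text node.
    escaped = value.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    # errors="xmlcharrefreplace" writes unrepresentable characters as decimal
    # numeric character references.
    return escaped.encode(encoding, errors="xmlcharrefreplace")


# U+20AC EURO SIGN is not in ISO-8859-1; U+00E9 (e with acute) is.
sample = "price: 10 \u20ac, caf\u00e9"

print(serialize_xml_text_node(sample))  # b'price: 10 &#8364;, caf\xe9'

try:
    serialize_text(sample)
except UnicodeEncodeError as err:
    print("text method would signal an error:", err.reason)
```

Note that the spec says "should" rather than "must" in both passages, so a real processor may still behave differently from this sketch.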