An element's value is an invalid Unicode string ... how can it be well-fo
Hi Folks,

Consider this Spanish name:

    Martiñez

Instead of using the ñ character, one can use the base character "n" followed by a combining tilde character (hex 303). So that Spanish name can be equivalently expressed as:

    Martiñez

Here is an XML document that uses the latter form:

    <?xml version="1.0" encoding="utf-8"?>
    <Name>Martiñez</Name>

I wrote a stylesheet that uses the substring() function to extract the combining tilde character and onward:

    <xsl:template match="/">
        <Result>
            <xsl:value-of select="substring(Name, 7)" />
        </Result>
    </xsl:template>

The output is:

    <?xml version="1.0" encoding="UTF-8"?>
    <Result>̃ez</Result>

I checked it for well-formedness and the XML parser says it is well-formed.

According to the book Fonts & Encodings (p. 61, first paragraph):

    ... we select a substring that begins with a combining character, this new string will not be a valid string in Unicode.

The value of the <Result> element is not a valid Unicode string, so how can it be a well-formed XML document?

/Roger
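For anyone who wants to reproduce the experiment outside XSLT, here is a minimal Python sketch. It is an illustration only, not the stylesheet above: the variable names are made up for this example, and it assumes the Name element holds "n" followed by U+0303 as described. The slice [6:] stands in for substring(Name, 7), since XPath positions are 1-based.

```python
import unicodedata
import xml.etree.ElementTree as ET

COMBINING_TILDE = "\u0303"  # the "hex 303" character from the message

# Hypothetical names, used only for this illustration.
precomposed = "Marti\u00f1ez"                   # ñ as a single code point (U+00F1)
decomposed = "Martin" + COMBINING_TILDE + "ez"  # "n" followed by U+0303

# The two spellings are canonically equivalent: NFC composes n + U+0303 into ñ.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed

# XPath substring(Name, 7) starts at the 7th character, i.e. Python index 6.
tail = decomposed[6:]

# A non-zero canonical combining class means the character is a combining mark.
starts_with_combining = unicodedata.combining(tail[0]) != 0

# Wrap the tail in an element and parse it; a parse error would raise here.
doc = "<Result>" + tail + "</Result>"
parsed_text = ET.fromstring(doc).text

print(same_after_nfc, starts_with_combining, parsed_text == tail)
```

The parser accepts the document without complaint, which matches the well-formedness result reported above: the parser checks that each character is one XML allows, and does not appear to check whether the text starts with a base character.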