Re: Character entities and document encoding
Please see Tim Bray's excellent treatise on this topic [1].

Kind Regards,
Joe Chiusano
Booz | Allen | Hamilton
Strategy and Technology Consultants to the World

[1] http://www.tbray.org/ongoing/When/200x/2003/04/26/UTF

Andy Greener wrote:
>
> I'd appreciate some advice on the following issues...
>
> Being from the UK, we have a requirement to convey the UK pound-sterling
> character in XML documents (and validate those documents of course).
> The Unicode decimal value of pound sterling is 163 (0xA3), but of course
> the UTF-8 encoding is 0xC2A3.
>
> I'm ok with the fact that a UTF-8 encoded instance doc can contain the
> above two byte values directly (i.e. 0xC2 and 0xA3), but I'm getting
> conflicting opinion as to whether replacing those two bytes with the
> character entity £ is equivalent or not - I think not, so long as
> the document is UTF-8 encoded, though it would be correct to do this
> if the encoding were "ISO-8859-1", as would inserting the actual pound
> character (ie the 8 bit value equivalent to 0xA3). However, I'm happy to
> be corrected.
>
> I guess the fundamental question is: how are character entities
> interpreted in relation to the document encoding (i.e. what's the
> order of evaluation)? If that's not the fundamental question then
> I'm missing something :-))
>
> A supplementary question: if I want to validate text containing pound
> sterling characters, and my Schemas are UTF-8 encoded, what do I put in
> the pattern facet: £ or the two character UTF-8 encoding? And what
> will your average regular expression evaluator make of the latter?
>
> Thanks in advance for any help
> --
>
> Andy Greener Mob: +44 7836 331933
> GID Ltd, Reading, UK Tel: +44 118 956 1248
> andy@g... Fax: +44 118 958 9005
>
> -----------------------------------------------------------------
> The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
> initiative of OASIS <http://www.oasis-open.org>
>
> The list archives are at http://lists.xml.org/archives/xml-dev/
>
> To subscribe or unsubscribe from this list use the subscription
> manager: <http://www.oasis-open.org/mlmanage/index.php>
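
On the order-of-evaluation question, a minimal sketch may help. It assumes the character entity in question is the numeric character reference `&#163;`, and it uses Python's standard `xml.etree.ElementTree` parser purely as an illustration. In XML, a character reference names a Unicode code point and is resolved after the parser has decoded the document's bytes, so the reference means the same character whatever encoding is declared. The element name `price`, the helper `text_of`, and the sample documents are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

POUND = "\u00a3"  # U+00A3 POUND SIGN, decimal 163

# UTF-8 document carrying the character as the two bytes 0xC2 0xA3.
utf8_literal = b'<?xml version="1.0" encoding="UTF-8"?><price>\xc2\xa3</price>'

# UTF-8 document carrying a numeric character reference instead.
utf8_charref = b'<?xml version="1.0" encoding="UTF-8"?><price>&#163;</price>'

# ISO-8859-1 document carrying the character as the single byte 0xA3.
latin1_literal = b'<?xml version="1.0" encoding="ISO-8859-1"?><price>\xa3</price>'

# ISO-8859-1 document carrying the same reference in hexadecimal form.
latin1_charref = b'<?xml version="1.0" encoding="ISO-8859-1"?><price>&#xA3;</price>'


def text_of(doc: bytes) -> str:
    """Parse a document and return the root element's text as decoded characters."""
    return ET.fromstring(doc).text


# All four documents yield the same one-character string.
for doc in (utf8_literal, utf8_charref, latin1_literal, latin1_charref):
    assert text_of(doc) == POUND

# The byte sequences differ only at the serialization layer.
assert ord(POUND) == 163
assert POUND.encode("utf-8") == b"\xc2\xa3"
assert POUND.encode("iso-8859-1") == b"\xa3"

# Pattern matching works on characters, not bytes. Python's `re` module stands
# in here for a schema pattern facet; it is not an XML Schema regex engine.
PRICE = re.compile("\u00a3[0-9]+\\.[0-9]{2}")
sample = b'<?xml version="1.0" encoding="UTF-8"?><price>&#163;12.50</price>'
assert PRICE.fullmatch(text_of(sample)) is not None
```

Under these assumptions the literal character and the character reference are interchangeable in a UTF-8 document as well as in an ISO-8859-1 one. A pattern would likewise contain the single character (or a reference to it), never its individual UTF-8 bytes.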