Re: XML-enabled databases, XQuery APIs
> One point I would like to make is from the python link
> (http://www.oreillynet.com/pub/wlg/6291) where mention is made to the
> assumption of parsing 8-bit text documents when Unicode docs may be the
> norm in the future.

Unicode encodings are already the default encodings in Windows and in more recent Linux distributions, so that would be now, not "in the future".

> Unicode is (according to my understanding) an 8-bit escaping system. That
> is if the character is extended, it is written into a second, third and
> then consecutive bytes if required.

No, that is (more or less) a description of UTF-8. Unicode itself has nothing to do with bytes or encodings; it is a mapping of a set of characters (with associated names and other properties) to numbers in the range hex 0 to 10FFFF.

> So to do really *fast* unicode stuff, ideally, the in-memory view wouldn't
> store the characters in 8-bit, but just as 32-bit (4 byte) or 64-bit (8 byte)
> strings.

That would be the UCS4 encoding (otherwise known as UTF-32):
http://www.google.co.uk/search?q=UCS4

UTF-16 is also common, probably more so than UTF-8 (Java uses UTF-16 by default, as does MSXML).

David
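To make the distinction in the message above concrete (a Unicode character is a number, and UTF-8, UTF-16 and UTF-32 are different ways of writing that number as bytes), here is a minimal Python sketch. It is only an illustration: the helper name `encoded_lengths` and the sample characters are assumptions chosen for the example, not anything taken from the thread.

```python
# Illustrative sketch: one code point, several byte encodings.
# The sample characters are arbitrary picks: U+0041, U+00E9, U+20AC, U+1D11E.
samples = ["A", "\u00e9", "\u20ac", "\U0001D11E"]


def encoded_lengths(ch):
    """Return the code point of ch and its byte length in three encodings."""
    return {
        "code_point": ord(ch),                   # the Unicode number itself
        "utf-8": len(ch.encode("utf-8")),        # 1 to 4 bytes per code point
        "utf-16": len(ch.encode("utf-16-be")),   # 2 or 4 bytes; "-be" omits the BOM
        "utf-32": len(ch.encode("utf-32-be")),   # always 4 bytes per code point
    }


for ch in samples:
    info = encoded_lengths(ch)
    print(f"U+{info['code_point']:04X}",
          info["utf-8"], info["utf-16"], info["utf-32"])
```

The code point is the same whichever encoding is used; only the byte count changes. UTF-32 is fixed width, which is the "fast" in-memory layout the quoted poster was asking about, while UTF-8 and UTF-16 are variable width.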