RE: Localisation: Character Encodings & RDBMS, Unicode->UTF-8 with Round Tripping

  • From: Dylan Walsh <Dylan.Walsh@K...>
  • To: xml-dev@x...
  • Date: Mon, 19 Jun 2000 09:40:33 +0100

Forwarding, as it is relevant to this thread.

> -----Original Message-----
> From:	Ronald Bourret [SMTP:rpbourret@h...]
> Sent:	Saturday, June 17, 2000 12:35 PM
> To:	mrys@m...; Dylan.Walsh@K...
> Subject:	RE: Localisation: Character Encodings & RDBMS,
> Unicode->UTF-8 with Round Tripping
> 
> Michael Rys wrote:
> 
> >Most databases provide Unicode support (e.g., nchar). Since UTF-8 is an
> >encoding where the Unicode two-byte characters are mapped into a
> >single-byte character space such that for some characters two or three
> >single-byte characters are used, you of course can easily store UTF-8 as
> >well in a single-character string datatype. However, strlen functions are
> >normally oblivious to the fact that you actually have UTF-8 stored in the
> >latter case, but just from a storage point of view, you should be able to
> >roundtrip either UTF-8 or Unicode.
> 
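The strlen point above can be illustrated with a minimal sketch. Python is used only for illustration and is not part of the original message. It assumes the sample name "Schäfer" with a precomposed 'ä' (U+00E4), and compares the character count with the byte count that a byte-oriented length function would report for the UTF-8 form.

```python
# Assumption: the sample name uses the precomposed character U+00E4 for 'ä'.
name = "Sch\u00e4fer"

# UTF-8 form, as it would sit in a single-byte character column.
utf8_bytes = name.encode("utf-8")

char_count = len(name)        # number of Unicode code points
byte_count = len(utf8_bytes)  # what a byte-oriented strlen would report

print(char_count, byte_count)
```

The two counts differ whenever the string contains a character outside ASCII, which is why length checks and column sizes need care when UTF-8 is stored in a byte-oriented string type.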
> Note also that, unless the database knows it is storing UTF-8, any
> characters that require two bytes to be stored will be unqueryable. For
> example, suppose the character 'ä' requires two bytes to be stored (I don't
> actually know if it does or not) and the database thinks it is storing
> ASCII. If so, the query
> 
>   SELECT * FROM Employees WHERE Name="Schäfer"
> 
> will fail because the bytes actually stored in the database are:
> 
>   "Sch--fer"
> 
> where -- represents the two bytes needed to store 'ä', which don't match 
> "Schäfer".
> 
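For reference, 'ä' (U+00E4) does take two bytes in UTF-8: 0xC3 0xA4. The mismatch described above can be reproduced with a small sketch, again in Python purely for illustration. It assumes the stored value was written as UTF-8 while the query literal arrives in a single-byte encoding (ISO-8859-1 is an assumed stand-in here, since pure ASCII cannot represent 'ä' at all), and that the database compares raw bytes.

```python
# Assumption: the application wrote the name as UTF-8 bytes.
stored = "Sch\u00e4fer".encode("utf-8")

# Assumption: the query literal is sent in a single-byte encoding.
query_single_byte = "Sch\u00e4fer".encode("iso-8859-1")

# A byte-wise comparison, as a database unaware of UTF-8 would perform it.
matches = stored == query_single_byte

# What the stored bytes look like when read back as single-byte characters.
misread = stored.decode("iso-8859-1")

print(stored, query_single_byte, matches, misread)
```

The comparison only succeeds if the query literal is encoded the same way as the stored value, for example by sending the query in UTF-8 as well.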
> This is obviously not a problem if the data is not used except through
> XML.
> 
> > > Can you convert the various encoding schemes to UTF-8 for storage, and
> > > convert them back on retrieval?
> 
> Yes.
> 
> > > Would such round-tripping require you to
> > > store the name of the original encoding alongside the UTF version?
> 
> It would need to be stored somewhere -- in the database, in the
> application, in a file that shows how XML is mapped to the database, etc.
> 
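A rough sketch of the round trip discussed above, in Python for illustration only. The helper names `store` and `retrieve` are hypothetical, and keeping the encoding name next to the UTF-8 bytes is just one of the options listed (database, application, or mapping file).

```python
from typing import Tuple


def store(raw: bytes, source_encoding: str) -> Tuple[bytes, str]:
    """Convert bytes in the original encoding to UTF-8 and keep the encoding name."""
    text = raw.decode(source_encoding)
    return text.encode("utf-8"), source_encoding


def retrieve(utf8_bytes: bytes, source_encoding: str) -> bytes:
    """Convert the stored UTF-8 bytes back to the original encoding."""
    return utf8_bytes.decode("utf-8").encode(source_encoding)


# Assumption: the incoming document was encoded as ISO-8859-1.
original = "Sch\u00e4fer".encode("iso-8859-1")
row = store(original, "iso-8859-1")
restored = retrieve(*row)

print(restored == original)
```

Without the stored encoding name, the UTF-8 text can still be recovered, but the original byte sequence cannot be reproduced reliably.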
> -- Ron Bourret
> 
> P.S. Feel free to forward this to xml-dev if you want. I'm not currently a
> member and can't post.

***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@x...&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************
