
RE: Localisation: Character Encodings & RDBMS, Unicode->UTF-8 with Round Tripping

  • From: Dylan Walsh <Dylan.Walsh@K...>
  • To: xml-dev@x...
  • Date: Mon, 19 Jun 2000 14:22:43 +0100

ro character
The irony here is just too much. :-)

I have appended the content of Matt's message below.

> -----Original Message-----
> From:	Matt Sergeant [SMTP:matt@s...]
> Sent:	Monday, June 19, 2000 11:28 AM
> To:	Dylan Walsh
> Cc:	xml-dev@x...
> Subject:	RE: Localisation: Character Encodings & RDBMS,
> Unicode->UTF-8 with Round Tripping
> 
> This message uses a character set that is not supported by the Internet
> Service.  To view the original message content,  open the attached
> message. If the text doesn't display correctly, save the attachment to
> disk, and then open it using a viewer that can display the original
> character set. 
> 
> << File: message.txt >> 
> 
On Mon, 19 Jun 2000, Dylan Walsh wrote:

> Forwarding, as it is relevant to this thread.
>
> > -----Original Message-----
> > From:	Ronald Bourret [SMTP:rpbourret@h...]
> > Sent:	Saturday, June 17, 2000 12:35 PM
> > To:	mrys@m...; Dylan.Walsh@K...
> > Subject:	RE: Localisation: Character Encodings & RDBMS,
> > Unicode->UTF-8 with Round Tripping
> >
> > Michael Rys wrote:
> >
> > >Most databases provide Unicode support (e.g., nchar). Since UTF-8 is an
> > >encoding where the Unicode two-byte characters are mapped into a single-byte
> > >character space such that for some characters two or three single-byte
> > >characters are used, you of course can easily store UTF-8 as well in a
> > >single-character string datatype. However, strlen functions are normally
> > >oblivious to the fact that you actually have UTF-8 stored in the latter case,
> > >but just from a storage point of view, you should be able to roundtrip
> > >either UTF-8 or Unicode.
> >
> > Note also that, unless the database knows it is storing UTF-8, any
> > characters that require two bytes to be stored will be unqueryable. For
> > example, suppose the character 'ä' requires two bytes to be stored (I don't
> > actually know if it does or not) and the database thinks it is storing
> > ASCII. If so, the query
> >
> >   SELECT * FROM Employees WHERE Name="Schäfer"
> >
> > will fail because the bytes actually stored in the database are:
> >
> >   "Sch--fer"
> >
> > where -- represents the two bytes needed to store 'ä', which don't match
> > "Schäfer".

They do if the query is also in UTF-8, and therefore you're requesting:

SELECT * FROM Employees WHERE Name="Sch--fer"

(using your syntax).
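
For anyone who wants to check this without a database, here is a rough
sketch in Python. It is only an illustration under stated assumptions: the
"column" is a plain byte string compared byte for byte, which is how a
database that believes it holds single-byte text would behave, and the
names stored_name and byte_match are made up for the example. It also
settles the open question above: 'ä' (U+00E4) does take two bytes in UTF-8.

```python
# Hypothetical stand-in for a byte-oriented column: the "database" keeps raw
# bytes and compares them byte for byte, with no knowledge of the encoding.
NAME = "Sch\u00e4fer"  # "Schäfer", with a precomposed U+00E4

stored_name = NAME.encode("utf-8")  # what ends up on disk


def byte_match(stored: bytes, query_literal: bytes) -> bool:
    """Encoding-unaware equality, as in WHERE Name = '...' on raw bytes."""
    return stored == query_literal


utf8_query = NAME.encode("utf-8")      # client sends the literal as UTF-8
latin1_query = NAME.encode("latin-1")  # client sends the literal as ISO-8859-1

print(len(NAME))                              # character count
print(len(stored_name))                       # byte count, what strlen would see
print("\u00e4".encode("utf-8"))               # the two bytes behind "--"
print(byte_match(stored_name, utf8_query))    # same encoding on both sides
print(byte_match(stored_name, latin1_query))  # mismatched encodings
```

The match succeeds only when the query literal reaches the server in the
same encoding as the stored data; the difference between the two lengths is
the strlen effect mentioned earlier in the thread.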

-- 
<Matt/>

Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org | AxKit: http://axkit.org

***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@x...&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************
