Re: Genx


To: xml-dev@l...
Subject: Re: Genx
From: Joe English <jenglish@f...>
Date: Wed, 21 Jan 2004 15:33:17 -0800
In-reply-to: <14F9ED9C-4C5B-11D8-905C-000A95A51C9E@t...>
References: <14F9ED9C-4C5B-11D8-905C-000A95A51C9E@t...> <C85FBC6B-4B14-11D8-905C-000A95A51C9E@t...> <200401211840.i0LIeLo07627@d...> <20040121195737.GH30165@s...> <E40F8B75-4C55-11D8-905C-000A95A51C9E@t...> <20040121213708.GA4414@s...>

Tim Bray wrote:

> jcowan wrote:
> > C and C++ on the Windows platform *are* UTF-16 centric.  If you put
> > a Gothic character into a "..."L string, for example
>
> So you're saying that it would be satisfactory for genx to infer that if
>
>     sizeof(wchar_t) == 2
>
> then the values are UTF16 coded units? -Tim

I'd say that depends on what degree of portability you're
after, and whether or not you use any of the wcs* or mb*
standard library routines.

If you want it to be strictly-conforming C, that's *not* a
safe assumption.  If OTOH you only need it to be portable to
a plurality of relatively modern, not-too-badly-braindamaged
systems, it's probably OK.

More specifically: if sizeof(wchar_t) == 2 and CHAR_BIT == 8,
then you can safely assume that a wchar_t can hold a UCS-2
code point.  You should *not* assume that the compiler and C
standard library will interpret them as such.
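
A minimal sketch of those two separate questions, storage width versus interpretation, might look like the following. The helper names are hypothetical and not part of genx; the first check is loosened from "exactly 2 bytes of 8 bits" to "at least 16 bits", and the second relies on the C99 macro __STDC_ISO_10646__, which an implementation defines when it promises that wchar_t values are ISO 10646 code points.

```c
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* wchar_t, size_t */
#include <wchar.h>

/* Hypothetical helper: nonzero if a wchar_t has at least 16 bits of
 * storage, i.e. it is wide enough to carry a 16-bit code value.
 * This says nothing about how the C library interprets that value. */
static int wchar_can_hold_16_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT >= 16;
}

/* Hypothetical helper: nonzero if the implementation itself declares
 * (via the C99 macro __STDC_ISO_10646__) that wchar_t values are
 * ISO 10646 code points.  Absence of the macro means "no promise". */
static int wchar_is_declared_iso10646(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}
```

A caller would treat the first result as a storage guarantee only, and fall back to doing its own encoding and decoding whenever the second returns 0.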

Nor should you assume that the compiler and C standard library
will interpret multibyte sequences as UTF-8 (many don't).
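
One way to probe this at run time, offered only as a heuristic sketch: feed the C library a known three-byte UTF-8 sequence and see whether it decodes it as a single character in the current locale. The function name is hypothetical, the choice of U+20AC as the sample is an assumption, and a pass does not prove the locale is UTF-8 in general.

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical heuristic: returns 1 if the current LC_CTYPE locale
 * decodes the bytes E2 82 AC (UTF-8 for U+20AC) as one 3-byte
 * character, 0 otherwise. */
static int multibyte_looks_like_utf8(void)
{
    const char sample[] = "\xE2\x82\xAC";
    mbstate_t state;
    wchar_t wc = 0;
    size_t used;

    memset(&state, 0, sizeof state);        /* initial conversion state */
    used = mbrtowc(&wc, sample, 3, &state);
    if (used != 3)
        return 0;                           /* error, or a different encoding */
#ifdef __STDC_ISO_10646__
    /* Only meaningful when wchar_t values are ISO 10646 code points. */
    if (wc != (wchar_t)0x20AC)
        return 0;
#endif
    return 1;
}
```

A program that never calls setlocale() runs in the "C" locale, where this check is expected to fail; code that needs UTF-8 regardless of locale is better off decoding the bytes itself.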

You should *definitely* not assume that wchar_t's are UTF-16 code
units: any implementation that does so is just plain wrong --
UTF-16 is a variable-width encoding, so one 16-bit unit is not
always one character (unless you restrict it to the BMP, in
which case it's the same as UCS-2).
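
To make the variable-width point concrete, here is a small sketch that counts code points in an array of UTF-16 code units. The function name is hypothetical, uint16_t is used instead of wchar_t so the width is explicit, and counting an unpaired surrogate as one code point is an arbitrary choice made for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count code points in n UTF-16 code units.
 * A high surrogate (D800..DBFF) followed by a low surrogate
 * (DC00..DFFF) forms one code point outside the BMP; every other
 * unit, including an unpaired surrogate, is counted as one. */
static size_t utf16_count_code_points(const uint16_t *units, size_t n)
{
    size_t i = 0;
    size_t count = 0;

    while (i < n) {
        if (units[i] >= 0xD800 && units[i] <= 0xDBFF &&
            i + 1 < n &&
            units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
            i += 2;     /* surrogate pair: two units, one code point */
        } else {
            i += 1;     /* BMP character: one unit, one code point */
        }
        count++;
    }
    return count;
}
```

For example, the Gothic letter U+10330 is encoded as the pair D800 DF30, so the three units 0041 D800 DF30 hold only two characters, which is exactly where a "one wchar_t per character" assumption breaks.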

--Joe English

  jenglish@f...
