Re: Genx
On Jan 21, 2004, at 11:57 AM, jcowan@r... wrote:

>> The 'codePoint' typedef may be problematic:
>>
>> // Unicode code points (4-byte int on most systems)
>> typedef wchar_t codePoint;
>>
>> ...

> I have argued privately that wchar_t is in fact the Right Thing here
> despite its variability in size (UTF-32 on Unix platforms, UTF-16 on
> Windows), because it makes genx compatible with both standardized and
> non-standardized facilities, most especially "..."L strings. Some
> conditional logic will be needed to interpret the input as UTF-16 or
> UTF-32, which can be based on sizeof(wchar_t). Hypothetical platforms
> where sizeof(wchar_t) == 1 can be neglected.

Almost. How about we leave it as wchar_t, but *not* UTF-16, so a value that's in a surrogate block is an error. Then we change the name from codePoint (which could be interpreted as meaning "UTF-16 Code Point") to something more explicit like

numericValueCorrespondingToAUnicodeCharacterAsInUPlusFourHexDigitsIsThatClear

John Cowan has suggested that "codeUnit" might be a good name; I'd be inclined to "uniChar". Any other ideas?

If someone wants to put a generic UTF-16 processor on top of genx, that would be fine. I don't see the demand for supporting it at the input end of genx, because the UTF-16-centric languages like Java and C# have decent XML-writing software already.

-Tim
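
To make the proposal concrete, here is a minimal sketch of what "wchar_t, but not UTF-16" could look like in C. It is not taken from genx: the typedef name `uniChar` is just one of the candidates floated above, and `isScalarValue` and `firstBadChar` are hypothetical helpers invented for illustration. The only assumption is that each `wchar_t` holds one whole character value, so anything in the surrogate range U+D800 to U+DFFF (or above U+10FFFF) is treated as an error.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical name; the thread has not settled on one. */
typedef wchar_t uniChar;

/*
 * Returns 1 if c is acceptable as a single Unicode character value,
 * 0 if it lies in the surrogate block or beyond U+10FFFF.
 * The cast to unsigned long makes a negative wchar_t (on platforms
 * where wchar_t is signed) compare as out of range.
 */
static int isScalarValue(uniChar c)
{
    unsigned long v = (unsigned long) c;

    if (v >= 0xD800UL && v <= 0xDFFFUL)
        return 0;               /* surrogate: an error, not half of a pair */
    if (v > 0x10FFFFUL)
        return 0;               /* outside the Unicode code space */
    return 1;
}

/*
 * Scans a zero-terminated string and returns the index of the first
 * unacceptable value, or -1 if every value passes isScalarValue().
 */
static long firstBadChar(const uniChar *s)
{
    size_t i;

    for (i = 0; s[i] != 0; i++) {
        if (!isScalarValue(s[i]))
            return (long) i;
    }
    return -1;
}
```

With this rule, a caller on a platform with a 16-bit `wchar_t` can still pass L"..." strings as long as they stay within the Basic Multilingual Plane. A string containing a surrogate pair is rejected instead of being decoded, which is the behavior a UTF-16 layer on top would have to handle itself.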