Re: RELAXNG Compact Syntax and character escapes
> I'm certainly no Relax expert, but on the face of it it does *NOT*
> sound reasonable. In general XML and Unicode processing, one *MUST*
> handle characters with code points beyond U+FFFF. They are not
> optional. This is true even if your programming language (Java
> perhaps?) has inadequate support for them.

What was I thinking? Don't code at 2:00 a.m., or at least don't email lists when you can't figure stuff out at 2:00 a.m. I think this is a better effort; all it took was some reading. Comments are, of course, still eagerly awaited.

    // Set the character, but check for surrogates
    if (escapeChar <= 0xFFFF) {
        // Output directly
        readBuffer[i] = (char)escapeChar;
    } else if (escapeChar <= 0x10FFFF) {
        // Greater than 16 bits (max 20 after the subtraction), need a surrogate pair
        escapeChar -= 0x10000;
        // Output High Surrogate (add top 10 bits to 0xD800)
        readBuffer[i++] = ((char) (0xD800 | (escapeChar >> 10)));
        // Output Low Surrogate (add bottom 10 bits to 0xDC00)
        readBuffer[i] = ((char) (0xDC00 | (escapeChar & 0x03FF)));
    } else {
        // The value is too large
        Error("Character reference is too large for UTF-16",
              ((int)escapeChar).ToString("X"), null);
    }

All the best,
Jeff Rafter
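P.S. As a sanity check of the surrogate arithmetic above, here is a minimal standalone sketch that runs the same split on U+1D11E and compares the result with .NET's built-in char.ConvertFromUtf32. The names SurrogateDemo and ToUtf16 are hypothetical and not part of the parser. Rejecting the range 0xD800 to 0xDFFF is an extra assumption that the snippet above does not make.

```csharp
using System;

public static class SurrogateDemo
{
    // Hypothetical helper: returns the UTF-16 code units for one code point.
    public static char[] ToUtf16(int codePoint)
    {
        if (codePoint < 0 || codePoint > 0x10FFFF)
            throw new ArgumentOutOfRangeException(nameof(codePoint), "beyond U+10FFFF");

        // Assumption beyond the original snippet: surrogate code points
        // are not characters in their own right, so reject them.
        if (codePoint >= 0xD800 && codePoint <= 0xDFFF)
            throw new ArgumentOutOfRangeException(nameof(codePoint), "surrogate code point");

        if (codePoint <= 0xFFFF)
            return new[] { (char)codePoint };   // fits in one 16-bit unit

        int v = codePoint - 0x10000;            // 20 bits remain
        return new[]
        {
            (char)(0xD800 | (v >> 10)),         // high surrogate: top 10 bits
            (char)(0xDC00 | (v & 0x03FF))       // low surrogate: bottom 10 bits
        };
    }

    public static void Main()
    {
        char[] units = ToUtf16(0x1D11E);        // MUSICAL SYMBOL G CLEF
        Console.WriteLine("{0:X4} {1:X4}", (int)units[0], (int)units[1]); // D834 DD1E

        // Cross-check against the framework's own conversion.
        Console.WriteLine(new string(units) == char.ConvertFromUtf32(0x1D11E)); // True
    }
}
```

Working it by hand: 0x1D11E - 0x10000 = 0xD11E; the top 10 bits are 0x34 and the bottom 10 bits are 0x11E, giving the pair D834 DD1E.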