Re: size?
Kay Michael wrote:
>
> > -----Original Message-----
> > From: Steve Muench [mailto:SMUENCH@xxxxxxxxxxxxx]
> > It turns
> > out that the notion of the "length" of a string is
> > naturally and conveniently defined if you restrict
> > yourself to single-byte character sets, but for multibyte
> > character sets the notion of "length" is less well-defined.
>
> The number of characters in a string is perfectly well-defined in XML.

The XML spec says "At user option, processors may normalize such
characters to some canonical form." Normalization can change the number
of characters in a string (by composing or decomposing characters).

Another problem is with non-BMP characters (surrogate pairs). In XML
these are treated as a single character, but the DOM counts them as two
characters.

> It
> might not be exactly the definition that an expert in Ethiopian or
> Glagolitic might like, but it would be good enough for the rest of us.

It's more a matter of putting in a definition that speakers of many
non-English languages would find counter to their established cultural
conventions. Imagine a spec that counted the letters "i" and "j" as two
characters and every other English character as one character.

James

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
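
The two counting problems raised in the message (normalization and non-BMP characters) can be seen in a small Python sketch. It is not part of the original thread, and the helper names `code_points` and `utf16_units` are made up for illustration. It assumes that counting code points approximates XML's notion of a character, and that counting UTF-16 code units approximates what the DOM reports.

```python
import unicodedata


def code_points(s: str) -> int:
    # Python 3 strings are sequences of Unicode code points,
    # so len() counts one per character in the XML sense.
    return len(s)


def utf16_units(s: str) -> int:
    # Each UTF-16 code unit is 2 bytes; a non-BMP character
    # is encoded as a surrogate pair, i.e. two code units.
    return len(s.encode("utf-16-le")) // 2


composed = "\u00e9"                                  # e-acute as one code point
decomposed = unicodedata.normalize("NFD", composed)  # "e" + U+0301 combining acute
non_bmp = "\U00010400"                               # a character outside the BMP

print(code_points(composed), code_points(decomposed))  # 1 2
print(code_points(non_bmp), utf16_units(non_bmp))      # 1 2
```

The same visible text gives a different count depending on the normalization form, and a single non-BMP character counts as two in a UTF-16 based API.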