FW: UTF-8+names
I wrote:

> Another fact that I think has been overlooked is the following.
>
> The following fragment of XML (encoded in UTF-8+names but
> displayed as if it were encoded in UTF-8) contains exactly 18
> Unicode characters:
>
>   <a>one&nbsp;two&lt;</a>
>
> because &nbsp; counts as one character and &lt;
> counts as 4 characters.
>
> The UTF-8+names encoding of this fragment of XML occupies 23
> bytes. The UTF-8 encoding occupies 19 bytes.

... and, by the way, the following fragment of XML is different from the one above (although it *looks* the same in this email) and contains 23 Unicode characters instead of 18:

  <a>one&nbsp;two&lt;</a>

The UTF-8 encoding of this fragment of XML occupies 23 bytes. The UTF-8+names encoding is longer than that because the first ampersand must be encoded as the three ASCII bytes & & ; so that the XML entity reference is not mistaken for the pseudo-entity &nbsp;.

Alessandro
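The counts above can be checked with a short Python sketch. It is only a toy model, not the UTF-8+names specification: it assumes the named character in the fragment is U+00A0 (written &nbsp;), that a literal ampersand is escaped as the three bytes `&&;` only when it would otherwise start a known name, and that `lt` is not such a name. The function `encode_utf8_names` and the `NAMES` table are hypothetical, introduced here for illustration.

```python
import re

# Assumption: the only named character we model is U+00A0 NO-BREAK SPACE,
# written "&nbsp;" in UTF-8+names. A real name table would be much larger.
NAMES = {"\u00a0": "nbsp"}
KNOWN = set(NAMES.values())


def encode_utf8_names(text: str) -> bytes:
    """Toy UTF-8+names encoder (hypothetical, for counting bytes only)."""
    out = []
    for i, ch in enumerate(text):
        if ch in NAMES:
            # A named character is written as the pseudo-entity &name;
            out.append("&" + NAMES[ch] + ";")
        elif ch == "&":
            # Assumption: a literal '&' is escaped as "&&;" only when it
            # would otherwise be read as a known pseudo-entity.
            m = re.match(r"&([A-Za-z0-9]+);", text[i:])
            out.append("&&;" if m and m.group(1) in KNOWN else "&")
        else:
            out.append(ch)
    return "".join(out).encode("utf-8")


# First fragment: the no-break space is a single Unicode character.
first = "<a>one\u00a0two&lt;</a>"
# Second fragment: "&nbsp;" is a real XML entity reference (six characters).
second = "<a>one&nbsp;two&lt;</a>"

for label, frag in (("first", first), ("second", second)):
    print(label,
          len(frag),                     # Unicode characters
          len(frag.encode("utf-8")),     # bytes in plain UTF-8
          len(encode_utf8_names(frag)))  # bytes in the toy UTF-8+names
```

Under these assumptions the first fragment comes out as 18 characters, 19 UTF-8 bytes and 23 UTF-8+names bytes, and the second as 23 characters and 23 UTF-8 bytes, with a UTF-8+names form that is longer than 23 bytes because of the escaped ampersand.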