RE: UTF-8+names
Another fact that I think has been overlooked is the following. The following fragment of XML (encoded in UTF-8+names but displayed as if it were encoded in UTF-8) contains exactly 18 Unicode characters:

<a>one&nbsp;two&lt;</a>

because &nbsp; counts as one character and &lt; counts as 4 characters. The UTF-8+names encoding of this fragment of XML occupies 23 bytes. The UTF-8 encoding occupies 19 bytes.

Now, while &nbsp; is easy to remember as being one of the magic pseudo-entities, how about any of those 2000+ pseudo-entities listed in the draft? Can anybody determine, without doing a lookup, how many Unicode characters are there in

<a>one&column-separator;two&lt;</a> ?

Is the general opinion here that this kind of confusion is not important (say, not important to software vendors and not important to users of XML technologies)?

Alessandro
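The counts quoted in the message can be checked mechanically. Below is a minimal sketch in Python, reading the first fragment as `<a>one&nbsp;two&lt;</a>`. It assumes a UTF-8+names decoder replaces a known name with its character and leaves XML's own `&lt;` untouched, as the message describes. The `NAMES` table and the `decode_names` helper are hypothetical stand-ins for illustration, not the draft's actual table or algorithm.

```python
import re

# Hypothetical one-entry name table; the draft is said to list 2000+ names.
NAMES = {"nbsp": "\u00a0"}


def decode_names(text: str) -> str:
    """Replace &name; with its character when the name is in NAMES.

    Names missing from the table (including XML's own &lt;) are left as-is.
    """
    return re.sub(
        r"&([A-Za-z][\w.-]*);",
        lambda m: NAMES.get(m.group(1), m.group(0)),
        text,
    )


fragment = "<a>one&nbsp;two&lt;</a>"
decoded = decode_names(fragment)

names_bytes = len(fragment.encode("utf-8"))  # bytes as stored in UTF-8+names
char_count = len(decoded)                    # Unicode characters after decoding
utf8_bytes = len(decoded.encode("utf-8"))    # bytes if stored as plain UTF-8

print(char_count, names_bytes, utf8_bytes)   # 18 23 19
```

For `&column-separator;` the sketch cannot give an answer without first adding that name to `NAMES`, which is exactly the lookup the message is asking about.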