RE: Unicode normalization in XML 1.1
> The point is that normalization is expensive, and it may be
> too expensive to do at all in small systems. Therefore, the
> W3C's choice (expressed in the Character Model) is to have
> senders normalize, and receivers check for normalization. In
> this way documents are normalized once at creation (or
> publication) time, rather than every time a document is
> received; this conserves net-wide cycles, since checking is
> cheaper than normalizing.

While this policy makes sense, its translation into rules for software components is unfortunately full of absurdities. The fact that the character model [1] bans text processing software from doing normalization [2] means that senders are going to have a tough job meeting the requirement to normalize the text, because they won't be able to find any text processing software that does the job for them.

[1] http://www.w3.org/TR/charmod/
[2] Section 4.4: "A text processing component .... must not normalize suspect text".

Michael Kay
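
For illustration only, the division of labour described in the quoted text (senders normalize once, receivers merely check) might look roughly like the sketch below. It assumes NFC as the target form and uses Python's standard unicodedata module (is_normalized needs Python 3.8 or later); the function names prepare_for_sending and accept_on_receipt are hypothetical and do not come from the Character Model or from XML 1.1.

```python
import unicodedata

# Assumption: NFC is the normalization form the sender targets.
FORM = "NFC"

def prepare_for_sending(text: str) -> str:
    """Sender side (hypothetical name): normalize once, at creation time."""
    return unicodedata.normalize(FORM, text)

def accept_on_receipt(text: str) -> bool:
    """Receiver side (hypothetical name): check only, never rewrite the text."""
    return unicodedata.is_normalized(FORM, text)

# 'e' followed by U+0301 COMBINING ACUTE ACCENT is not in NFC.
decomposed = "e\u0301"
composed = prepare_for_sending(decomposed)
```

Here the receiver reports a problem for decomposed and accepts composed, without ever modifying either string. The difficulty raised in the reply is on the sender side: something has to play the role of prepare_for_sending, and a rule that forbids text processing components from normalizing leaves it unclear which component that would be.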