Re: Unicode normalization in XML 1.1
* Lars Marius Garshol
|
| - clearly, documents that are not normalized are still well-formed,
|   so if the application is to have any guarantees here the processor
|   must do normalization before passing on the information,

* John Cowan
|
| Not so. A processor in normalization-check mode will report
| non-normalized input, so the application may make up its mind
| whether or not to accept it.

Uh, yes. Obviously what I wrote makes no sense.

* Lars Marius Garshol
|
| Wouldn't it be far better if the application could be certain that
| an XML 1.1 processor would provide normalized character data and to
| ignore the whole issue of how the document was encoded? After all,
| isn't the whole purpose of *having* XML parsers to insulate
| applications from worries about the lexical details of documents?

* John Cowan
|
| The point is that normalization is expensive, and it may be too
| expensive to do at all in small systems. Therefore, the W3C's
| choice (expressed in the Character Model) is to have senders
| normalize, and receivers check for normalization. In this way
| documents are normalized once at creation (or publication) time,
| rather than every time a document is received; this conserves
| net-wide cycles, since checking is cheaper than normalizing.

I can't say I like this, but at least I can see that there is
reasoning behind it and that the reasoning makes sense. Thanks for
clearing this up!

-- 
Lars Marius Garshol, Ontopian <URL: http://www.ontopia.net >
GSM: +47 98 21 55 50 <URL: http://www.garshol.priv.no >
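
For illustration only, the sender-normalizes / receiver-checks split described above might be sketched as follows in Python. The function names are hypothetical, and NFC is assumed as the target form (the thread itself does not name one). Note also that `unicodedata.is_normalized` (Python 3.8+) only tests a Unicode normalization form; it is not a full implementation of the checks the Character Model or an XML 1.1 processor would perform.

```python
import unicodedata


def check_normalized(text: str, form: str = "NFC") -> bool:
    """Receiver side (hypothetical helper): report whether the text is
    already in the given normalization form, without altering it."""
    return unicodedata.is_normalized(form, text)


def normalize_once(text: str, form: str = "NFC") -> str:
    """Sender side (hypothetical helper): normalize at creation or
    publication time, so receivers only need to check."""
    return unicodedata.normalize(form, text)


# 'e' followed by U+0301 COMBINING ACUTE ACCENT: not in NFC (assumed form).
decomposed = "e\u0301"

# The sender normalizes once; NFC composes the pair into U+00E9.
composed = normalize_once(decomposed)

# The receiver only checks, and the application decides what to do
# with input that fails the check.
for candidate in (decomposed, composed):
    status = "ok" if check_normalized(candidate) else "not normalized"
    print(repr(candidate), status)
```

In this sketch the receiver never rewrites the data; as in the normalization-check mode John describes, it merely reports the result and leaves acceptance to the application.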