Re: Unicode normalization in XML 1.1

To: xml-dev@l...
Subject: Re: Unicode normalization in XML 1.1
From: Lars Marius Garshol <larsga@g...>
Date: 06 Apr 2003 16:33:35 +0200
In-reply-to: <20030403132804.GI29046@c...>
References: <m3u1dfonr4.fsf@p...><20030403132804.GI29046@c...>
Sender: larsga@p...
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1


* Lars Marius Garshol
| 
|  - clearly, documents that are not normalized are still well-formed,
|    so if the application is to have any guarantees here the processor
|    must do normalization before passing on the information,

* John Cowan
| 
| Not so.  A processor in normalization-check mode will report
| non-normalized input, so the application may make up its mind
| whether or not to accept it.

Uh, yes. Obviously what I wrote makes no sense.
 
* Lars Marius Garshol
|
| Wouldn't it be far better if the application could be certain that
| an XML 1.1 processor would provide normalized character data, and
| could ignore the whole issue of how the document was encoded? After all,
| isn't the whole purpose of *having* XML parsers to insulate
| applications from worries about the lexical details of documents?
 
* John Cowan
|
| The point is that normalization is expensive, and it may be too
| expensive to do at all in small systems.  Therefore, the W3C's
| choice (expressed in the Character Model) is to have senders
| normalize, and receivers check for normalization.  In this way
| documents are normalized once at creation (or publication) time,
| rather than every time a document is received; this conserves
| net-wide cycles, since checking is cheaper than normalizing.

I can't say I like this, but at least I can see that there is
reasoning behind it and that the reasoning makes sense.
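
For anyone who wants to see the sender-normalizes, receiver-checks split
in concrete terms, here is a minimal sketch in Python. It is only an
illustration: the function names publish and is_acceptable are made up
for this example, and plain NFC is used as a stand-in for the Character
Model's stricter notion of normalized text. It needs Python 3.8 or later
for unicodedata.is_normalized.

```python
import unicodedata

FORM = "NFC"  # assumption: plain NFC stands in for the Character Model's definition


def publish(text: str) -> str:
    """Sender side: normalize once, at creation or publication time."""
    return unicodedata.normalize(FORM, text)


def is_acceptable(text: str) -> bool:
    """Receiver side: only check, never rewrite the data.

    The caller (the application) decides whether to accept or reject
    input that is reported as non-normalized.
    """
    return unicodedata.is_normalized(FORM, text)


# 'e' followed by U+0301 COMBINING ACUTE ACCENT: well-formed, but not NFC.
decomposed = "cafe\u0301"
composed = publish(decomposed)

for label, value in (("as typed", decomposed), ("published", composed)):
    status = "normalized" if is_acceptable(value) else "NOT normalized"
    print(f"{label}: {value!r} is {status}")
```

The receiver here never modifies what it was given; it only reports, and
the decision to accept or reject stays with the application, which is
the division of labour John describes.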

Thanks for clearing this up!

-- 
Lars Marius Garshol, Ontopian         <URL: http://www.ontopia.net >
GSM: +47 98 21 55 50                  <URL: http://www.garshol.priv.no >
