
genx: canonicalization vs. pretty printing


At 7:21 PM -0800 1/21/04, Tim Bray wrote:

>1. Any output encoding other than UTF-8
>2. Optional escaping of illegal characters
>3. Prettyprinting support
>4. Various kinds of error workarounds, and turning errorchecking off
>5. Writing CDATA sections
>6. Writing XML declaration (good idea, but I want the output to be 
>Canonical XML)

I think you're at best at 50% with these goals, maybe less. This is 
certainly not at 80%.

My experience with JDOM, XOM, and other APIs for doing operations 
like XInclude that ultimately result in a serialized XML document is 
that users really, really want pretty-printing a lot of the time. If 
the API won't do that, they will ignore it. I wouldn't pretty-print by 
default, but I would definitely include options for setting the 
maximum line length and indent string.
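
To make the "off by default, but available as an option" idea concrete, 
here is a minimal sketch. It uses Python's standard xml.dom.minidom 
purely for illustration; it is not genx, JDOM, or XOM, and the 
serialize() function and its parameter names are hypothetical. minidom 
has no maximum-line-length setting, so only the indent string is shown.

```python
from xml.dom import minidom


def serialize(xml_text, pretty=False, indent="  "):
    """Serialize compactly by default; indent only when the caller asks.

    `serialize`, `pretty`, and `indent` are illustrative names, not part
    of any library discussed in this thread.
    """
    root = minidom.parseString(xml_text).documentElement
    if pretty:
        # toprettyxml() inserts newlines and repeats `indent` per level.
        return root.toprettyxml(indent=indent)
    # toxml() writes the element with no added whitespace.
    return root.toxml()


compact = serialize("<a><b>text</b><c/></a>")
pretty = serialize("<a><b>text</b><c/></a>", pretty=True, indent="\t")
print(compact)
print(pretty)
```

The default path adds no whitespace at all, so callers who need exact 
output are not affected by the existence of the option.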

Furthermore, I would make canonicalization an option if it's included 
at all. It imposes a significant performance hit since you have to 
sort the attributes and namespaces. Worse, it prevents full streaming 
since the attributes and namespaces have to be buffered before you 
sort them. But more importantly, probably half the time users don't 
care about canonicalization. The other half the time they do care. To 
be more specific, they don't want it. It's too damn ugly and the lines 
are too long to work with. Plus there's no XML declaration, which 
users like.
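
The buffering point can be sketched as follows. This is a simplified 
illustration in Python, not genx's implementation: both function names 
are hypothetical, namespace declarations are ignored, and attributes 
are sorted by plain name, whereas Canonical XML sorts by namespace URI 
and then local name.

```python
import io
from xml.sax.saxutils import quoteattr


def start_tag_streaming(out, name, attrs):
    """Write each attribute as soon as it arrives; nothing is held back."""
    out.write("<" + name)
    for key, value in attrs:  # attrs may be a one-shot iterator
        out.write(" %s=%s" % (key, quoteattr(value)))
    out.write(">")


def start_tag_sorted(out, name, attrs):
    """Hold every attribute in memory, sort, and only then write any of them."""
    buffered = sorted(attrs)  # consumes the whole iterator before any output
    out.write("<" + name)
    for key, value in buffered:
        out.write(" %s=%s" % (key, quoteattr(value)))
    out.write(">")


streamed, ordered = io.StringIO(), io.StringIO()
start_tag_streaming(streamed, "e", iter([("b", "2"), ("a", "1")]))
start_tag_sorted(ordered, "e", iter([("b", "2"), ("a", "1")]))
print(streamed.getvalue())
print(ordered.getvalue())
```

The first writer can emit an attribute the moment the caller supplies 
it. The second cannot write the first attribute until it has seen the 
last one.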

Similarly, users actively desire non-UTF-8 encodings. UTF-16, 
Latin-1, SJIS, etc. are all much easier for some classes of users to 
process with non-XML aware tools than is UTF-8 on today's systems.
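
As a rough illustration of encoding selection as a serializer option, 
here is a sketch using Python's standard xml.etree.ElementTree (again, 
not one of the libraries under discussion). The sample document is made 
up for the example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<name>caf\u00e9</name>")

# Same document, two caller-selected output encodings.
latin1 = ET.tostring(root, encoding="iso-8859-1")
utf8 = ET.tostring(root, encoding="utf-8")

print(latin1)
print(utf8)
```

In the Latin-1 output the accented character is a single byte and the 
serializer adds an XML declaration naming the encoding, which is what a 
non-XML-aware tool on a Latin-1 system would need.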

I think an XML output library needs to realize that opening files in 
a plain text editor is still a very important use case a lot of the 
time. Byte-by-byte comparison and digital signatures usually aren't 
(and when they are, the digital signature library will canonicalize 
first). Requiring canonical XML and not pretty-printing or allowing 
encoding selection does not give users the simple library that does 
what they need. It is not sufficiently full-featured to hit the 80-20 
point.

-- 

   Elliotte Rusty Harold
   elharo@m...
   Effective XML (Addison-Wesley, 2003)
   http://www.cafeconleche.org/books/effectivexml
   http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA
