
Re: XML vs the Dreaded Whitespace

  • From: Tim Bray <tbray@t...>
  • To: xml-dev@i...
  • Date: Sat, 13 Dec 1997 14:57:07 -0800

At 03:00 AM 11/12/97 -0500, Chris Smith wrote:
>Part of this work requires that these documents carry document
>authentication information. This, in turn, requires that some regions
>of an XML document must be transported *exactly*, and must be received
>and checked identically so that the message authentication actually
>works. The fact that we are considering the idea of including email
>as a transport mechanism doesn't help matters.

So your proposal is: 
(1) transcode into UTF-16 if necessary
(2) digitally sign what you get after (1).

I think this is a sensible way to go.  Obviously, there are
anomalies:

<a foo='1' bar="2"/> 
will not be the same as
<a
 foo="1"
 bar='2'
></a>

which is surprising, but trying to find solutions may well not be
cost-effective.
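
To make the anomaly concrete, here is a minimal Python sketch of steps (1) and (2). It rests on assumptions that are not in the proposal itself: big-endian UTF-16 with no byte-order mark, and a SHA-256 digest standing in for a real digital signature. The function name digest_as_sent is made up for illustration.

```python
import hashlib


def digest_as_sent(xml_text: str) -> str:
    """Digest the document exactly as written, after transcoding to UTF-16."""
    # Step (1): transcode. UTF-16 big-endian without a BOM is an assumption.
    data = xml_text.encode("utf-16-be")
    # Step (2): a SHA-256 digest stands in for the signature (assumption).
    return hashlib.sha256(data).hexdigest()


# The two spellings of the same element from the message above.
compact = "<a foo='1' bar=\"2\"/>"
spread = "<a\n foo=\"1\"\n bar='2'\n></a>"

same = digest_as_sent(compact) == digest_as_sent(spread)
print(same)  # False: the characters differ, so the digests differ
```

Any difference in quoting, line breaks, or empty-element syntax changes the signed bytes, even though an XML processor would report the same element and attributes for both.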

You *might* want to consider losing the prologue and start checking
just at the root element.

You *might* want to consider normalizing namespace prefixes.

You *might* want to normalize whitespace in markup.

You *might*, etc etc etc etc; unless you are willing to commit to
a full grove/property-set model a la SGML's extended facilities, you
may well be better off signing the instance as it sits.

In particular, I think there are lots of things that would be easier
and less trouble-prone to work around than line-breaking, which is well
known to be highly error-prone.  For example, in the line-break HERE->    
how many space characters that you can't see follow the ">"?
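
As a rough illustration of why line breaks are fragile, the Python sketch below simulates one hypothetical kind of transport damage (LF rewritten to CRLF and trailing spaces dropped) and shows that the signed bytes no longer match. The sample document, the damage model, and the use of SHA-256 over UTF-16 in place of a real signature are all assumptions.

```python
import hashlib


def trailing_spaces(line: str) -> int:
    """Count the space characters at the end of a line."""
    return len(line) - len(line.rstrip(" "))


def digest(text: str) -> str:
    # SHA-256 over big-endian UTF-16 stands in for the signature (assumption).
    return hashlib.sha256(text.encode("utf-16-be")).hexdigest()


# Hypothetical document; the second line ends in three invisible spaces.
original = "<a>\n<b/>   \n</a>\n"

# Hypothetical damage in transit: trailing spaces stripped, LF becomes CRLF.
mangled = "\r\n".join(line.rstrip(" ") for line in original.split("\n"))

counts = [trailing_spaces(line) for line in original.split("\n")]
print(counts)                               # [0, 3, 0, 0]
print(digest(original) == digest(mangled))  # False
```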

There might be a useful halfway point as follows; run it through an
XML processor and sign just the combination of element type, attribute
name-value pairs, and textual content that the processor emits; this 
allows you to finesse a lot of quoting/white-space/line-end issues; 
also it allows authors to use tricks like default attributes and 
internal entities that don't "really" change the content.
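
The halfway point could be sketched in Python roughly as follows. This is only one possible reading: it uses the standard library's xml.etree.ElementTree as the XML processor, sorts attributes by name (attribute order is not significant in XML, but the sorting is my assumption), length-prefixes each field so that adjacent values cannot run together, and uses SHA-256 in place of a real signature. The names logical_events and logical_digest are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET


def logical_events(elem):
    """Yield element types, attribute name-value pairs, and text, in order."""
    yield ("start", elem.tag)
    for name in sorted(elem.attrib):  # sorted order is an assumption
        yield ("attr", name, elem.attrib[name])
    if elem.text:
        yield ("text", elem.text)
    for child in elem:
        yield from logical_events(child)
        if child.tail:  # text that follows the child's end tag
            yield ("text", child.tail)
    yield ("end", elem.tag)


def logical_digest(xml_text: str) -> str:
    """Digest what the parser reports rather than the raw characters."""
    root = ET.fromstring(xml_text)
    h = hashlib.sha256()
    for event in logical_events(root):
        for field in event:
            data = field.encode("utf-8")
            h.update(len(data).to_bytes(4, "big"))  # length prefix
            h.update(data)
    return h.hexdigest()


compact = "<a foo='1' bar=\"2\"/>"
spread = "<a\n foo=\"1\"\n bar='2'\n></a>"
print(logical_digest(compact) == logical_digest(spread))  # True
```

Whitespace inside textual content is still significant in this sketch; only quoting, whitespace inside tags, and empty-element syntax are finessed. Whether defaulted attributes and entity replacements show up in the digest depends on what the chosen parser reports.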

On the other hand, I'd say that off the top, just digitally signing the
UTF-i-fied characters as they sit is a reasonable way to go. -Tim


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)

