
RE: MSXML DOM Special Chars Less Than 32


> From: Joshua Allen [mailto:joshuaa@m...]
> Sent: Friday, March 22, 2002 11:30 PM
> To: michael.h.kay@n...; Rick Jelliffe; xml-dev@l...
> Subject: RE:  MSXML DOM Special Chars Less Than 32
>
>
> > I don't want to dumb XML down. But we do sometimes need to store data
> > (e.g. WebDAV property values) which can potentially contain characters
> > that are not permitted in XML. In fact, it's very unlikely that a WebDAV
> > property value will contain such a character, but the software still
> > needs to allow for the possibility.
>
> Why would someone want to use XML if they need to transmit illegal
> characters?  There are usually two cases -- one is where the illegal
> characters are insignificant, in which case they can be stripped and the
> output is well-formed XML.  The other case is where the illegal
> characters *are* significant, and must be preserved for round-trip.  But
> if someone wants to round-trip characters that are clearly not permitted
> by any XML processor in the world, why use XML?  That's like getting mad
> because a car won't float.

That's a bit like saying that XML should not be used as a marshalling
format when arbitrary strings are sent around. So should SOAP and
WebDAV be changed?

The problem is that when these protocols were designed, the fact that XML's
concept of character data differs from that of an arbitrary string
apparently wasn't considered.

For instance:

1) What will MS SharePoint Server do when a property name starts with a
leading digit, and a WebDAV PROPFIND request asking for "all" properties
comes in? (Answer: it sends non-well-formed XML response bodies, breaking
every compliant XML processor / WebDAV client in the world. Interestingly
enough, Microsoft's own clients "handle" this.)

2) What is a WebDAV server supposed to do if it's actually accessing a
backend system it doesn't control entirely, and if a property value contains
control characters other than CR, LF or TAB? Your choices are: a) fail the
request, b) drop the offending characters, c) invent a new marshalling
format that is still compatible with "xs:string".
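
To make the two cases above concrete, here is a minimal Python sketch. It
assumes the XML 1.0 "Char" production as the definition of a legal
character and uses the standard-library expat-based parser as the
"compliant XML processor". The helper names (is_simple_xml_name,
is_wellformed, fail_on_illegal, drop_illegal) are made up for illustration,
and the name check is a deliberately simplified ASCII-only test, not the
full XML Name production. It covers choices a) and b) only.

```python
import re
import xml.etree.ElementTree as ET

# XML 1.0 "Char": #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
#                 | [#x10000-#x10FFFF]. Anything else is illegal.
ILLEGAL_XML_CHARS = re.compile(
    "[^\t\n\r\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def is_simple_xml_name(name: str) -> bool:
    """Simplified (ASCII-only) check: a name must not start with a digit."""
    return re.fullmatch(r"[A-Za-z_][A-Za-z0-9._-]*", name) is not None


def is_wellformed(doc: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


def fail_on_illegal(value: str) -> str:
    """Choice a): reject the value if it contains an illegal character."""
    match = ILLEGAL_XML_CHARS.search(value)
    if match:
        code = ord(match.group())
        raise ValueError(f"U+{code:04X} is not allowed in XML 1.0")
    return value


def drop_illegal(value: str) -> str:
    """Choice b): silently remove illegal characters (lossy)."""
    return ILLEGAL_XML_CHARS.sub("", value)


# Case 1: an element named after a property with a leading digit.
print(is_simple_xml_name("1prop"))            # False
print(is_wellformed("<1prop>x</1prop>"))      # False

# Case 2: a control character, even as a character reference.
print(is_wellformed("<prop>a&#1;b</prop>"))   # False
print(repr(drop_illegal("a\x01b\tc")))        # 'ab\tc'
```

Note that choice b) cannot round-trip: once the character is dropped, the
receiver has no way of knowing it was ever there.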

> > arguments. I guess the C lobby is sufficiently entrenched that we'll
> > never allow &#0;, but apart from that I don't really see the need for
> > restrictions.
>
> But that is exactly the point: even if we started again from scratch,
> there exists a subset of characters that will end up being illegal.
> There will also exist a certain population of users who disagree with
> each illegal character choice.  There will additionally be a certain
> population of implementers who disagree with the *permissiveness* of the
> characters, since it makes their lives difficult, and they have to
> handle characters in a way that is unnatural (NEL for Unix people, for
> example).
>
> So my point is that the set of illegal characters will always be an
> arbitrary value-judgment that tries to balance between implementers and
> users.  I do not think it is an objective "there is one right answer"
> situation.

Agreed.

However, ignoring the issue doesn't exactly help either. Many
applications/protocols are stuck with the task of marshalling "arbitrary"
strings as XML (with datatype xs:string), so it would be good if there were
an XML 1.0-compliant, cross-protocol format to do this.
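
As an illustration of what such a format might look like, here is a small
Python sketch of one possible escaping scheme. The scheme itself is purely
hypothetical (it is not defined by XML, SOAP, or WebDAV): a backslash is
doubled, and every character outside the XML 1.0 "Char" production is
written as \uXXXX. The encoded value is an ordinary xs:string, so any XML
1.0 processor can carry it, but both ends have to agree on the convention.
The function names encode and decode are assumptions for this example.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Matches a backslash, or any character outside the XML 1.0 Char production.
_NEEDS_ESCAPE = re.compile(
    "\\\\|[^\t\n\r\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)
# Matches the two escape forms produced by encode(): \\ and \uXXXX.
_ESCAPED = re.compile(r"\\(\\|u([0-9A-F]{4}))")


def encode(value: str) -> str:
    """Turn an arbitrary string into one that is legal XML character data."""
    def repl(match):
        ch = match.group()
        if ch == "\\":
            return "\\\\"
        # All illegal characters are below U+10000, so 4 hex digits suffice.
        return "\\u{:04X}".format(ord(ch))
    return _NEEDS_ESCAPE.sub(repl, value)


def decode(value: str) -> str:
    """Reverse encode(); unknown backslash sequences are left untouched."""
    def repl(match):
        if match.group(1) == "\\":
            return "\\"
        return chr(int(match.group(2), 16))
    return _ESCAPED.sub(repl, value)


original = "a\x00b\\c\x1f"
wire = encode(original)                      # a\u0000b\\c\u001F
doc = "<value>" + escape(wire) + "</value>"  # well-formed XML 1.0
received = ET.fromstring(doc).text
print(decode(received) == original)          # True
```

The obvious drawback is that a receiver unaware of the convention sees the
escaped text rather than the original string, which is exactly why a
cross-protocol agreement would be needed.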

