
RE: Namespaces, Xml Schema Whitespace normalization, xs:anyURI

  • To: <lists@j...>,<xml-dev@l...>
  • Subject: RE: Namespaces, Xml Schema Whitespace normalization, xs:anyURI, and URILiterals in XPath 2.0
  • From: "Michael Kay" <mike@s...>
  • Date: Mon, 27 Mar 2006 21:20:12 +0100
  • In-reply-to: <442838A4.9070403@j...>
  • Thread-index: AcZR0qM44WSsPewgRzKlHSEG0FYyjAABeAxQ

I think that you are right that the specs are mutually inconsistent in this
area. This started a long time ago, with the base Namespaces spec saying
that it's not an error if the "namespace name" isn't a URI, and the Infoset
saying "it might not be an error, but you don't get an infoset". Much of the
discussion below hinges on the question of whether a namespace "URI" can or
cannot contain a space, and that depends on whether you take the Namespaces
Rec or the Infoset as your starting point. Comments below.

> In XPath 2.0 (CR) many of the namespace properties

What exactly do you mean by "namespace properties"?

> are defined as xs:anyURI with requirements on whitespace normalization
> defined in the XML Schema spec. Now, I am not sure which rules are being
> referred to but I can only guess that they are the whitespace
> normalization rules in the structures spec [2] because there are no
> rules for normalization in the updated wording in the errata for
> xs:anyURI [3].

Section 4.3.6 of Schema Part 2 is also relevant. This states that the
whitespace normalization applied to values of type xs:anyURI is collapse.
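
As a rough illustration of what "collapse" means here, the following Python sketch applies the two steps described in Schema Part 2 (replace tab, line feed and carriage return by a space, then collapse runs of spaces and trim the ends). The function name `collapse` is my own choice for the example, not something taken from any spec or product.

```python
import re

def collapse(value: str) -> str:
    """Sketch of the XML Schema 'collapse' whitespace treatment."""
    # Step 1 ("replace"): tab, line feed and carriage return each become a space.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    # Step 2: runs of spaces shrink to one; leading and trailing spaces are dropped.
    return re.sub(r" +", " ", replaced).strip(" ")

# Hypothetical input: leading spaces, a doubled space and a trailing newline.
print(collapse("  http://www.example.com/Example with two  spaces\n"))
```

The result contains only single spaces between words and no whitespace at either end.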


> Michael Kay's corrective wording for URILiteral is:
>
> "The URILiteral is subjected to whitespace normalization as defined for
> the xs:anyURI type in [XML Schema]: this means that leading and trailing
> whitespace is removed, and any other sequence of whitespace characters
> is replaced by a single space (#x20) character. Whitespace normalization
> is done after the expansion of CharRefs, so writing a newline (say) as
> &#xA; does not prevent its being normalized to a space character." [4]

Note that this rule applies to URIs written as literals in the text of a
query. Many of these are indeed namespace URIs.

> 
> Now this leads me to my larger question: is whitespace normalization 
> allowed for namespace declarations? If not, does this ruin their 
> comparability? In Namespaces in XML 1.1 (I am using 1.1 because it 
> contains better wording for what was already understood in 1.0), it 
> states that namespaces must be compared lexically and that the
> comparison should take place after attribute normalization (so CharRefs
> are expanded) [5]. Because of this, you may end up with a
> single-normalized namespace IRI/URI and a double-normalized namespace
> property in XPath 2.0. Consider the namespace name:
> 
>    xmlns:foo="http://www.example.com/Example with two &#x20;spaces"
> 
> The namespace name will be viewed as (after normalization):
> 
>    http://www.example.com/Example with two  spaces
> 
> While the doubly normalized property value will be (after XML Schema 
> whitespace normalization):
> 
>    http://www.example.com/Example with two spaces
> 
> If this is true then lexical comparison will fail. Is this accurate?

You need to consider the context in which the comparison takes place. If
it's comparison of a name appearing in a path expression, such as
child::foo:bar, then it's true that the namespace declaration

xmlns:foo="http://www.example.com/Example with two &#x20;spaces"

in XQuery binds foo to a different namespace than the textually identical
declaration

xmlns:foo="http://www.example.com/Example with two &#x20;spaces"

in XML or XSLT. This is an example of the general observation that character
references in XQuery do not work quite the same way as they do in XML. It's
also true that there's no way in XQuery of binding a prefix to a namespace
containing two consecutive spaces. Since there's no way of building an
Infoset containing a namespace whose name contains even a single space, I
don't think anyone will shed many tears over this.
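
To make the difference concrete, here is a small Python sketch of the two treatments applied to the declaration above. It is only an approximation under stated assumptions: the attribute is treated as CDATA, only numeric character references are handled (no named entities), and the function names `xml_attribute_value` and `xquery_uri_literal` are hypothetical labels for this example rather than real APIs.

```python
import re

CHAR_REF = re.compile(r"&#x([0-9A-Fa-f]+);|&#([0-9]+);")

def expand_char_refs(text: str) -> str:
    """Replace numeric character references by the characters they denote."""
    def repl(m):
        if m.group(1) is not None:
            return chr(int(m.group(1), 16))
        return chr(int(m.group(2)))
    return CHAR_REF.sub(repl, text)

def xml_attribute_value(literal: str) -> str:
    """Approximate XML attribute-value normalization for a CDATA attribute:
    literal tab/LF/CR become spaces, characters produced by character
    references are kept as they are, and nothing is collapsed."""
    parts, pos = [], 0
    for m in CHAR_REF.finditer(literal):
        parts.append(re.sub(r"[\t\n\r]", " ", literal[pos:m.start()]))
        parts.append(expand_char_refs(m.group(0)))
        pos = m.end()
    parts.append(re.sub(r"[\t\n\r]", " ", literal[pos:]))
    return "".join(parts)

def xquery_uri_literal(literal: str) -> str:
    """Approximate the URILiteral rule: expand character references first,
    then apply 'collapse' whitespace normalization."""
    expanded = expand_char_refs(literal)
    return re.sub(r"[ \t\n\r]+", " ", expanded).strip(" ")

decl = "http://www.example.com/Example with two &#x20;spaces"
xml_name = xml_attribute_value(decl)    # keeps both spaces
xquery_name = xquery_uri_literal(decl)  # collapses them to one
print(xml_name == xquery_name)
```

Under these assumptions the two strings differ by exactly one space, so a character-by-character comparison treats them as different namespace names.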

Your subject heading refers to XPath rather than XQuery. In XPath,
namespaces are bound externally, so there are no such constraints imposed by
the language, though there may be constraints imposed by the environment. 
> 
> [1] http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/
>    * Note, XPath 2.0 refers to the REC first edition not the SE

I'm surprised - this must be an oversight.

Michael Kay
http://www.saxonica.com/


