
Re: @xml-base in subtrees included (a) via entity expansion, a

  • From: Michael Kay <mike@saxonica.com>
  • To: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
  • Date: Thu, 16 Nov 2017 08:42:40 +0000

Thanks for studying the problem.

> I am not entirely certain that I understand the problem.  But I infer from your posing the question that:
> 
>  (1) Saxon, and not your upstream XML processor, is taking responsibility for calculating the base URIs of elements.
>  (2) Your upstream processor, and not Saxon, is responsible for both XInclude processing and general entity expansion.
>  (3) Your upstream processor is, as required by the spec, leaving xml:base attributes in the top-level included element items of an inclusion.  And for reasons not at all clear to me, it is sometimes using a relative reference in the xml:base attribute, and not an absolute reference.
> 
> Are those inferences correct?

Yes. (On (3), I think there are good usability reasons for putting a relative reference in the xml:base attribute: it makes the document relocatable as part of a complex of interlinked documents.)

> 
> If any of these fails to hold, I don’t see how any difficulty arises (which means you may need to educate me).
> 
> 
>> When an external entity is expanded, and the entity in question contains an element with an xml:base attribute, the value of the @xml:base attribute is supposed to be resolved against the base URI of the external entity itself (not against the base URI of the element into which the entity's expansion is grafted).
> 
> Well, yes in some cases.  The value of any relative reference, whether in an @xml:base attribute or elsewhere, is to be resolved with respect to the relevant base URI.
> That will be the base URI of the external entity itself if but only if the base URI has not been reset in some way (e.g. by an xml:base attribute on an ancestor).

Yes, that is the case we are interested in.
> 
>> But when xi:include is processed, the xi:include processor injects an @xml:base attribute which is intended to be resolved against the base URI of the "include parent" (that is, the parent of the xi:include element).
> 
> And whose effective value should be the URI of the included material.  

Let me try giving an example. Consider first a single-entity document (A) with base URI http://example.com/doc.xml:

<out>
  <in xml:base="dir/in.xml"/>
</out>

then the base URI of <in> is http://example.com/dir/in.xml.

If we now take this document (B) at the same location:

<!DOCTYPE out [
<!ENTITY e SYSTEM "dir/in.xml">
]>
<out>&e;</out>

where the external entity is

<in xml:base="dir/in.xml"/>

then the document after entity expansion is

<out>
  <in xml:base="dir/in.xml"/>
</out>

but the base URI of <in> is now http://example.com/dir/dir/in.xml

If we now take this document (C) at the same location:

<out>
  <xi:include href="dir/in.xml"/>
</out>

where the included document is 

<in/>

then the expanded document (delivered by Apache Xerces) is

<out>
  <in xml:base="dir/in.xml"/>
</out>

and the base URI of <in> is http://example.com/dir/in.xml 

My challenge is to distinguish these three cases, where the surface structure (the values of elements and attributes) is in all cases the same, but the base URIs are different. In particular the XPath expression base-uri(//in) must deliver the correct answer in all three cases.
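
The arithmetic behind the three cases can be checked with plain RFC 3986-style resolution. The following is a minimal sketch using java.net.URI; the class and method names (BaseUriCases, resolve) are made up for illustration and have nothing to do with Saxon's internals.

```java
import java.net.URI;

public class BaseUriCases {

    // Resolve a (possibly relative) xml:base value against the base URI in scope.
    static String resolve(String baseInScope, String xmlBase) {
        return URI.create(baseInScope).resolve(xmlBase).toString();
    }

    public static void main(String[] args) {
        String doc = "http://example.com/doc.xml";

        // (A) xml:base written directly in the document: resolve against the document.
        String a = resolve(doc, "dir/in.xml");

        // (B) xml:base sits inside the external entity, so it is resolved against
        // the entity's own URI, which is itself "dir/in.xml" relative to the document.
        String entityUri = resolve(doc, "dir/in.xml");
        String b = resolve(entityUri, "dir/in.xml");

        // (C) xml:base injected by the XInclude processor: resolve against the
        // base URI of the include parent, here the document itself.
        String c = resolve(doc, "dir/in.xml");

        System.out.println("A: " + a); // http://example.com/dir/in.xml
        System.out.println("B: " + b); // http://example.com/dir/dir/in.xml
        System.out.println("C: " + c); // http://example.com/dir/in.xml
    }
}
```

The attribute value is identical in all three calls; only the base it is resolved against differs, and that base is exactly what is no longer visible in the expanded tree.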

I currently distinguish (A) and (B) by detecting that the location information supplied by the SAX parser for the <in> element has a different systemID from the location of the <out> element. But this heuristic is giving me the wrong answer for case (C).
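
For concreteness, a rough sketch of that kind of heuristic using the standard SAX Locator. This is not Saxon's actual code: the class name SystemIdHeuristic, the crossedBoundary list and the parse helper are invented for the example, and the external entity is served from a string by a stub EntityResolver so that nothing is fetched over the network.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SystemIdHeuristic extends DefaultHandler {
    private Locator locator;
    // systemId reported at the start tag of each currently open element
    private final Deque<String> openSystemIds = new ArrayDeque<>();
    // elements whose systemId differs from that of their parent element
    final List<String> crossedBoundary = new ArrayList<>();

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        String here = (locator == null || locator.getSystemId() == null)
                ? "" : locator.getSystemId();
        if (!openSystemIds.isEmpty() && !here.equals(openSystemIds.peek())) {
            crossedBoundary.add(qName);
        }
        openSystemIds.push(here);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openSystemIds.pop();
    }

    // Hypothetical helper: parse xml as if loaded from systemId, answering every
    // external entity request with entityText.
    static List<String> parse(String xml, String systemId, String entityText) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        SystemIdHeuristic handler = new SystemIdHeuristic();
        reader.setContentHandler(handler);
        reader.setEntityResolver((publicId, sysId) -> {
            InputSource entity = new InputSource(new StringReader(entityText));
            entity.setSystemId(sysId); // keep the entity's resolved system ID
            return entity;
        });
        InputSource in = new InputSource(new StringReader(xml));
        in.setSystemId(systemId);
        reader.parse(in);
        return handler.crossedBoundary;
    }

    public static void main(String[] args) throws Exception {
        String b = "<!DOCTYPE out [<!ENTITY e SYSTEM \"dir/in.xml\">]><out>&e;</out>";
        System.out.println(parse(b, "http://example.com/doc.xml", "<in xml:base=\"dir/in.xml\"/>"));
    }
}
```

A test of this shape separates (A) from (B), but as noted it does not give the right answer for (C).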

It does occur to me that there is one way I could detect the difference between (B) and (C): presumably the SAX parser will call LexicalHandler.startEntity() and LexicalHandler.endEntity() for case (B), but not for case (C). Using that information will be messy (it needs an extra bit somewhere in the XDM model representation, and bits are in short supply) but it may be do-able.
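
A sketch of what that might look like, assuming the parser does report general entities through LexicalHandler. The names EntityTracker, startedInsideEntity and parse are hypothetical; the sketch only records a flag per element name and says nothing about where such a bit would live in a real tree model. Note that startEntity() is reported for internal general entities too, so a real implementation would need a further check that the entity is external.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

public class EntityTracker extends DefaultHandler2 {
    private int entityDepth = 0;
    // element name -> was its start tag seen between startEntity() and endEntity()?
    final Map<String, Boolean> startedInsideEntity = new LinkedHashMap<>();

    // "[dtd]" and parameter entities ("%name") are not content entities; skip them.
    private static boolean isGeneralEntity(String name) {
        return !name.startsWith("[") && !name.startsWith("%");
    }

    @Override
    public void startEntity(String name) {
        if (isGeneralEntity(name)) entityDepth++;
    }

    @Override
    public void endEntity(String name) {
        if (isGeneralEntity(name)) entityDepth--;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        startedInsideEntity.put(qName, entityDepth > 0);
    }

    // Hypothetical helper: parse xml as if loaded from systemId, answering every
    // external entity request with entityText instead of going to the network.
    static Map<String, Boolean> parse(String xml, String systemId, String entityText)
            throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        EntityTracker tracker = new EntityTracker();
        reader.setContentHandler(tracker);
        // LexicalHandler is an optional extension, registered as a SAX property.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", tracker);
        reader.setEntityResolver((publicId, sysId) -> {
            InputSource entity = new InputSource(new StringReader(entityText));
            entity.setSystemId(sysId);
            return entity;
        });
        InputSource in = new InputSource(new StringReader(xml));
        in.setSystemId(systemId);
        reader.parse(in);
        return tracker.startedInsideEntity;
    }

    public static void main(String[] args) throws Exception {
        String b = "<!DOCTYPE out [<!ENTITY e SYSTEM \"dir/in.xml\">]><out>&e;</out>";
        System.out.println(parse(b, "http://example.com/doc.xml", "<in xml:base=\"dir/in.xml\"/>"));
    }
}
```

The idea would be to resolve xml:base against the entity's URI only when the flag is set for the element, and against the parent's base URI otherwise. Whether the callbacks really stay silent for XInclude output in case (C) is the presumption stated above, not something this sketch verifies.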

> 
> Can you tell your upstream XInclude processor to use absolute URIs in all injected xml:base attributes?

AFAIK, No.
> 
> Can you ask it to provide the extension property named “include history”?

AFAIK, No.

Michael Kay
Saxonica

