Subject: Re: Re: A question about the expressive power and limitations of XPath 2.0
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Sun, 13 Jan 2002 08:26:33 +0000
Hi Dimitre,

> I thought that you could achieve concatenation by a simple:
>  <xsl:value-of select="$sequence" separator="''"/>
> or am I wrong? (of course, this is not "pure XPath")

No, you're right (well, separator has to just be separator="" - it's
an attribute value template, not an expression, but yes.)

OK, let me think... a higher-order distinct function is an example.
You have structured identifiers of the form "group.subgroup" and you
want to return a unique set of nodes based on the "group" part of the
identifier (note that Mike said they were discussing how to support
this already, so perhaps there'll be a new 'distinct' clause added to
the for expression to solve it). A recursive solution would be:

<xsl:function name="my:distinct">
  <xsl:param name="nodes" type="node*" select="()" />
  <xsl:param name="distinct" type="node*" select="()" />
  <!-- add $nodes[1] unless a node from the same group is already held -->
  <xsl:variable name="new-distinct"
    select="if ($nodes[1] and
                not(some $n in ($distinct)
                    satisfies (substring-before($n/@id, '.') =
                               substring-before($nodes[1]/@id, '.'))))
            then ($distinct | $nodes[1])
            else $distinct" />
  <!-- recurse on the rest of the nodes, carrying the accumulator along -->
  <xsl:result select="if ($nodes)
                      then my:distinct($nodes[position() > 1],
                                       $new-distinct)
                      else $distinct" />
</xsl:function>
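
As a language-neutral illustration of the same accumulator recursion, here
is a minimal Python sketch. It works on plain identifier strings rather than
nodes, and the names group_of and distinct_by_group are made up for this
example; it is not part of any XPath or XSLT draft.

```python
def group_of(identifier):
    """Mimic substring-before(identifier, '.'): the text before the first
    dot, or the empty string when there is no dot."""
    head, sep, _ = identifier.partition(".")
    return head if sep else ""


def distinct_by_group(ids, seen=()):
    """Return the first identifier seen for each group, recursively.

    `ids` plays the role of $nodes and `seen` the role of $distinct.
    """
    if not ids:
        return list(seen)
    head, rest = ids[0], ids[1:]
    if any(group_of(s) == group_of(head) for s in seen):
        # A member of this group is already held: skip the head.
        return distinct_by_group(rest, seen)
    # First member of a new group: add it to the accumulator.
    return distinct_by_group(rest, seen + (head,))
```

Called as distinct_by_group(["a.1", "a.2", "b.1", "a.3"]), the sketch keeps
only "a.1" and "b.1", one identifier per group.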

Hmm... or a simple reverse function:

<xsl:function name="my:reverse">
  <xsl:param name="items" type="item*" select="()" />
  <xsl:param name="reversed" type="item*" select="()" />
  <!-- move the first item to the front of $reversed, recurse on the rest -->
  <xsl:result select="if ($items)
                      then my:reverse($items[position() > 1],
                                      ($items[1], $reversed))
                      else $reversed" />
</xsl:function>

I don't think you can currently do reverse even if you have a sort()
function or clause of some kind, because you can't get information
about the position of the item you're processing from within the
return clause of a for expression.

>> Are you after examples that indicate the shortfallings in the
>> regular expression syntax, the match() or replace() functions as
>> defined or something more general that illustrates that regular
>> expressions can't be used to process every kind of string?
> I just want to be sure that in case I decided to propose something,
> it would be based on solid cases that nobody could say could be
> easily solved doing this or that from XPath 2.0.

If you want a solid use case, I'd use David's example. Anything I come
up with will just be toy examples. A more accessible version of
David's example (arguably) would be the fairly common situation where
you get a source document where a snippet of HTML content is embedded
within a CDATA section within the XML:

      <img src="bionicle.gif" width=30 height=50>
      This product (<a href="details.html">details</a>) is great.

You need to parse the HTML content into XHTML. To make the task doable
as an example, the embedded HTML can contain a restricted set of
elements from HTML - img, a, b and i. Naturally, b and i can nest
inside a and each other. (I expect that a stylesheet that did this
would be a really popular module!)

Subtasks from this are:

Creating a regular expression that matches start and end tags in
content (the regular expression syntax doesn't include backreferences,
so you can't check that the name in the end tag is the same as the
name in the start tag). Examples:

  a. Here's some <b>bold <i>and italic</i></b> text.
  b. Here's some <i>italic <b>and bold</b></i> text.

Create a regular expression that would pull out the following:

  a. <b>bold <i>and italic</i></b>
  b. <i>italic <b>and bold</b></i>
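
To make the backreference point concrete, here is a rough sketch using
Python's re module, which does support backreferences (unlike the regular
expression syntax discussed here). The patterns and the helper first_match
are assumptions for illustration, and they only cover attribute-free b and
i tags.

```python
import re

# Without a backreference the end tag can be any of the allowed names.
WITHOUT_BACKREF = re.compile(r"<(b|i)>.*?</(b|i)>")
# With \1 the end tag must repeat the name captured from the start tag.
WITH_BACKREF = re.compile(r"<(b|i)>.*?</\1>")

EXAMPLE_A = "Here's some <b>bold <i>and italic</i></b> text."
EXAMPLE_B = "Here's some <i>italic <b>and bold</b></i> text."


def first_match(pattern, text):
    """Return the first substring matched by pattern, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None
```

On example a, WITH_BACKREF pulls out the whole bold element, while
WITHOUT_BACKREF stops early at the first end tag, </i>. Even the
backreference version is fooled by same-name nesting such as
<b>x <b>y</b></b>, so it is not a general parser.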

Doing a replace that creates element structure rather than strings
(replace() doesn't do this at the moment). Because of this, you have
to use match(), but match() doesn't give you access to subexpression
matches.

A simple example is if you have dates in the format:


and you want to create:

  <date day="13" month="01" year="2002" />

This is achievable, but you end up running the same match three times
- once to get the day, once to get the month, once to get the year.
(Plus, I should note, there's no parse-date() function at the moment -
if there were it would be easier.)
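
For comparison, this is roughly what access to subexpressions buys you,
sketched in Python. The DD/MM/YYYY input format, the DATE_PATTERN name and
the date_element helper are all assumptions made up for this example. A
single match yields day, month and year together.

```python
import re
import xml.etree.ElementTree as ET

# Assumed (hypothetical) input format: DD/MM/YYYY, e.g. "13/01/2002".
DATE_PATTERN = re.compile(r"(\d{2})/(\d{2})/(\d{4})")


def date_element(text):
    """Build a <date> element from one match, or return None if the
    text is not in the assumed format."""
    m = DATE_PATTERN.fullmatch(text)
    if m is None:
        return None
    # All three subexpressions come from the same single match.
    day, month, year = m.groups()
    return ET.Element("date", {"day": day, "month": month, "year": year})
```

With this, date_element("13/01/2002") gives an element whose day, month and
year attributes are "13", "01" and "2002", without running the match three
times.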

These are the kinds of issues that David and I are trying to work
through at the moment - if you have some ideas we'd be really glad to
hear them.



Jeni Tennison

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
