
Any discussion of more Perl-like regex capability in XPath 2.1?

David Sewell dsewell at virginia.edu
Fri May 21 10:46:46 PDT 2010


Michael, thanks for the detailed feedback. My most important question
was answered, namely whether the working group has talked about
expanding regex/replacement functionality, and your reasons for not
adding the particular Perl extensions I mentioned are well taken.

David S.

On Fri, 21 May 2010, Michael Kay wrote:

>
> There is one other extension to regexes in 2.1: support for non-capturing
> groups has been added.
>
> I don't think there is willingness in the WG to trawl what's in Perl and add
> every feature that seems to make sense in an XPath context. Rather, features
> have been added if there is a strong use case, if the specification is
> well-defined in an XML/XPath context, and if the feature (not necessarily
> with identical syntax) appears to be supported in most of the regular
> expression libraries that XPath/XQuery implementors are likely to be using.
>
> Quite often with requests like this the difficulty is in finding a precise
> specification of what the Perl construct actually does. My first attempt to
> look up the spec for \l and \u gives:
>
> "lowercase|uppercase next char (think vi)"
>
> Given a joke of a specification like that (and this isn't untypical), we
> have in the past ended up running tests on implementations to see what the
> feature actually does, so that we can specify behaviour that will be
> consistent with Perl. Reverse engineering a spec by running tests against
> implementations is very labour-intensive, and the WGs are short of resource,
> so we're naturally disinclined to do it.
>
> As far as I can see this isn't actually an addition to the syntax of regular
> expressions you are proposing, it's an addition to the syntax of the
> replacement string in the replace() function. I think our view in the WG is
> that operations more complex than can be done using replace() require a
> feature like XSLT's xsl:analyze-string instruction, and to this end we have
> added the function fn:analyze-string() which will become available in XQuery
> 1.1: see
>
> http://www.w3.org/TR/2009/WD-xpath-functions-11-20091215/#func-analyze-string
>
> I think this achieves what you are looking for in a much more general and
> powerful way.
>
> Regards,
>
> Michael Kay
> http://www.saxonica.com/
> http://twitter.com/michaelhkay
>
>
>
> > -----Original Message-----
> > From: http://x-query.com/mailman/listinfo/talk
> > [mailto:http://x-query.com/mailman/listinfo/talk] On Behalf Of David Sewell
> > Sent: 20 May 2010 18:27
> > To: XQuery Talk
> > Subject:  Any discussion of more Perl-like regex
> > capability in XPath 2.1?
> >
> > So far as I can see in the working draft of "XPath and XQuery
> > Functions and Operators 1.1", the only real extension to
> > regular expressions is the addition of a "q" flag in matches
> > to block metacharacter
> > interpretation:
> >
> > http://www.w3.org/TR/xpath-functions-11/#flags
> >
> > I'm just curious whether there has been discussion of adding
> > any more Perl-like expressivity to XPath regular expressions.
> > For example, being able to use the escape sequences \u, \l,
> > etc. in replacement patterns would allow concision that is
> > currently not legal:
> >
> >  let $s := "from George Washington, 17 May 1789"
> >  return replace($s, '^(.+),.*', '\u$1')   (: ILLEGAL :)
> >
> >  ==> "From George Washington"
> >
> > instead of the verbosity required currently to do the same thing:
> >
> >   let $s := "from George Washington, 17 May 1789"
> >   return concat(
> >       upper-case(substring($s,1,1)),
> >       replace(substring($s,2), '^(.+),.*','$1')
> >       )
> >
> > I understand this would impose more burden on implementers,
> > but has there been demand from the user community (including
> > XSLT folk)?
> > (Count this email as "demand", of course.)
> >
> > DS
> >
> > --
> > David Sewell, Editorial and Technical Manager ROTUNDA, The
> > University of Virginia Press PO Box 801079, Charlottesville,
> > VA 22904-4318 USA
> > Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
> > Email: http://x-query.com/mailman/listinfo/talk   Tel: +1 434 924 9973
> > Web: http://rotunda.upress.virginia.edu/
> > _______________________________________________
> > http://x-query.com/mailman/listinfo/talk
>
>

-- 
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 801079, Charlottesville, VA 22904-4318 USA
Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
Email: http://x-query.com/mailman/listinfo/talk   Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/
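
The fn:analyze-string() function that Michael Kay points to above can handle the capitalization case from the original question. The following is only a minimal sketch, assuming a processor that supports XQuery 3.0 (the eventual name of the draft called 1.1 in this thread) and the final form of the function, which returns fn:match and fn:group elements. The regex and the variable names are illustrative and do not come from the thread.

```xquery
xquery version "3.0";

(: Sketch only: capitalize the first character of the text before the first comma. :)
let $s := "from George Washington, 17 May 1789"
(: Group 1 captures the first character, group 2 the rest up to the comma. :)
let $match := analyze-string($s, '^(.)([^,]*)')/fn:match[1]
let $first := $match/fn:group[xs:integer(@nr) = 1]
let $rest  := $match/fn:group[xs:integer(@nr) = 2]
return concat(upper-case($first), $rest)
(: ==> "From George Washington" :)
```

Because each captured group comes back as its own node, any function (not only upper-case) can be applied to it before the pieces are reassembled, which is the generality that a \u escape in the replacement string would not offer.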

