Re: A better way to construct regular expressions in XMLSchema

  • From: "Liam R. E. Quin" <liam@w3.org>
  • To: "Costello, Roger L." <costello@mitre.org>, "xml-dev@l..." <xml-dev@l...>
  • Date: Tue, 21 Nov 2017 14:51:42 -0500

On Mon, 2017-11-20 at 12:35 +0000, Costello, Roger L. wrote:
> Hi Folks,
> I have an XML Schema that needs some complex regular expressions. I
> have been using <!ENTITY> to construct the regexes. See below. I find
> it pretty hard to debug these regular expressions. Is there a better
> way to construct regexes?
I'd probably use fewer of them, and I might test them in a standalone
program (as long as you don't rely on any of the very minor differences
between XSD and the more widely-used Perl expressions), e.g.

#! /usr/bin/perl -w

my $LanguageTag = "(${langtag}) | (${privateuse}) | (${grandfathered})";

and so on (in reverse order so everything is defined before it's used)
and then,

while (<>) {
  chomp; # remove the newline at the end of the input line
  if (m/${LanguageTag}/x) {
    print "OK: $_\n";
  } else {
    print "unmatched: $_\n";
  }
}

Then if things that I expected to match don't, I'd try matching against
individual components in the expression.
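
That component-by-component check could be sketched roughly like this.
The three sub-patterns are deliberately simplified stand-ins, not the
real grammar from the RFC, and the helper name matching_components is
made up for illustration. Each component is anchored with ^ and $
because an XSD pattern has to match the whole value, whereas Perl's m//
is satisfied by a match anywhere in the string.

```perl
#! /usr/bin/perl
use strict;
use warnings;

# Simplified stand-ins for the real components (assumptions, not the RFC grammar).
# Whitespace inside the patterns is ignored because of the /x flag below.
my %component = (
    langtag       => '[a-zA-Z]{2,3} (-[a-zA-Z0-9]{2,8})*',
    privateuse    => 'x (-[a-zA-Z0-9]{1,8})+',
    grandfathered => 'i-klingon | en-GB-oed',
);

# Return the names of all components that match the whole input string.
sub matching_components {
    my ($input) = @_;
    my @hits;
    for my $name (sort keys %component) {
        my $re = $component{$name};
        push @hits, $name if $input =~ m/^(?:$re)$/x;
    }
    return @hits;
}

for my $tag ('en', 'en-GB', 'x-private', 'i-klingon', 'not a tag') {
    my @hits = matching_components($tag);
    print "$tag: ", (@hits ? join(', ', @hits) : 'no component matched'), "\n";
}
```

If an input matches more than one component, or none, that usually
points at the sub-pattern worth looking at first.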

If that sort of scripting isn't comfortable for you, though, you could
make an XML document with lots of test cases, one for each branch of
the regex. The validator might stop on the first error, though, which
is less than helpful at times :) so in that case perhaps use XSLT to
split it into lots of separate test case documents.

You could also test with XSLT or XQuery or standalone XPath 2 or later,
with replace(), e.g. replace($input, $pattern, "[1=$1,2=$2,3=$3]") to
see which ()-group matched. Watch that XSD patterns have different
meanings for \-escaped "special" thingies like \i, \a etc.
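
The same which-group-matched trick also works in a Perl test script,
for anyone not set up to run XPath. A rough sketch, assuming three
made-up branch patterns and a hypothetical helper called group_report;
it builds the same kind of "[1=...,2=...,3=...]" string that the
replace() call above would produce:

```perl
#! /usr/bin/perl
use strict;
use warnings;

# Three made-up branches, each in its own capturing group (assumptions for illustration).
my $pattern = '([a-z]{2,3}) | (x-[a-z0-9]+) | (i-[a-z]+)';

# Build a string like "[1=...,2=...,3=...]" showing what each group captured.
# Groups that did not take part in the match are shown as empty.
sub group_report {
    my ($input) = @_;
    my @groups = $input =~ m/^(?:$pattern)$/x
        or return "no match";
    return '[' . join(',', map { ($_ + 1) . '=' . ($groups[$_] // '') } 0 .. $#groups) . ']';
}

print "$_ -> ", group_report($_), "\n" for 'en', 'x-abc', 'i-klingon', '???';
```

This only tells you about Perl's reading of the pattern, so the caveat
about XSD-specific escapes still applies.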

Finally, be careful about baking in a specific version of a spec into
software if it's not necessary - consider a future revision of the RFC
that adds something your regexp doesn't match...


Liam Quin, W3C, http://www.w3.org/People/Quin/
Staff contact for Verifiable Claims WG, SVG WG, XQuery WG

Web slave for http://www.fromoldbooks.org/
