Re: Transform XSD complexType into regular expressions

  • From: Michael Kay <mike@saxonica.com>
  • To: Roger L Costello <costello@mitre.org>
  • Date: Thu, 12 May 2022 12:34:01 +0100

Testing whether a sequence of element names matches the content model of a complex type is fairly straightforward in principle. You take the particles in the content model and for each one you generate a term in the regular expression of the form `(Q\{uri\}local~){minOccurs,maxOccurs}`; for wildcards in the content model you generate a term in the regular expression that only matches the correct namespaces. Then you take the actual sequence of element names found in the instance document, form a string of their EQNames with each name followed by ~, and match this against the regular expression.
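As a rough illustration, here is a minimal Python sketch of that idea for a plain xs:sequence of element particles. The `Particle` class, the helper names, and the example namespace are all made up for the illustration; choice groups, nested groups, and wildcards are not handled, and it assumes `~` never occurs in a namespace URI.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Particle:
    """One element particle of a content model (illustrative type)."""
    uri: str
    local: str
    min_occurs: int = 1
    max_occurs: Optional[int] = 1  # None stands for maxOccurs="unbounded"


def eqname(uri: str, local: str) -> str:
    """Build the EQName form Q{uri}local."""
    return "Q{" + uri + "}" + local


def particle_term(p: Particle) -> str:
    """Generate (Q\\{uri\\}local~){minOccurs,maxOccurs} for one particle."""
    upper = "" if p.max_occurs is None else str(p.max_occurs)
    name = re.escape(eqname(p.uri, p.local))
    return "(?:" + name + "~){" + str(p.min_occurs) + "," + upper + "}"


def sequence_regex(particles: Iterable[Particle]):
    """Concatenate the terms of a sequence into one compiled regex."""
    return re.compile("".join(particle_term(p) for p in particles))


def is_valid_sequence(regex, children: Iterable[Tuple[str, str]]) -> bool:
    """Encode the child names (each followed by ~) and match the whole string."""
    text = "".join(eqname(uri, local) + "~" for uri, local in children)
    return regex.fullmatch(text) is not None


NS = "http://example.com/ns"  # made-up namespace for the example
MODEL = sequence_regex([
    Particle(NS, "title"),            # exactly one
    Particle(NS, "author", 1, None),  # one or more
    Particle(NS, "note", 0, 1),       # optional
])
```

With this model, `is_valid_sequence(MODEL, [(NS, "title"), (NS, "author")])` succeeds, while a sequence that starts with `author` does not; in the failing case all you get back is `False`.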

You can of course abbreviate the names to something shorter provided you do it consistently; you could even abbreviate to single characters (Unicode has plenty available) and drop the separators.
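A possible sketch of the single-character variant, again in Python. Mapping names to Unicode private-use characters starting at U+E000 is just one choice made for this example (that block only holds 6400 code points), and the element names are invented.

```python
import re
from typing import Dict, Iterable


def build_alphabet(names: Iterable[str]) -> Dict[str, str]:
    """Assign each distinct element name one private-use character."""
    distinct = dict.fromkeys(names)  # keeps first-seen order, drops repeats
    return {name: chr(0xE000 + i) for i, name in enumerate(distinct)}


ALPHABET = build_alphabet(["title", "author", "note"])
T, A, N = (ALPHABET[n] for n in ("title", "author", "note"))

# title, then one or more author, then an optional note; no separators needed
MODEL = re.compile(f"{T}{{1,1}}{A}{{1,}}{N}{{0,1}}")


def encode(children: Iterable[str]) -> str:
    """Encode child names as one character each (KeyError on unknown names)."""
    return "".join(ALPHABET[c] for c in children)
```

The same alphabet has to be used for the regular expression and for the encoded instance, which is the "consistently" part; a name outside the alphabet would need separate handling.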

But of course, you then find there are complications. If an element in the instance document matches a wildcard particle, then you need to know whether the wildcard particle specified strict or lax validation for its own content; the regex approach doesn't directly give you that information. Handling XSD 1.1 open content models gets tricky, etc. But I think the biggest obstacle is that you get very poor diagnostics for invalidities.
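To make the wildcard case concrete, here is a hedged Python sketch of a term for a wildcard that allows an explicit list of namespaces. The function name, the namespaces, and the `[^~{}]+` stand-in for "any local name" are assumptions of the example, not part of any real validator.

```python
import re
from typing import Iterable, Optional


def wildcard_term(namespaces: Iterable[str],
                  min_occurs: int = 1,
                  max_occurs: Optional[int] = 1) -> str:
    """Term matching any element whose namespace is in the allowed list."""
    allowed = "|".join(re.escape(ns) for ns in namespaces)
    upper = "" if max_occurs is None else str(max_occurs)
    # [^~{}]+ stands in for an arbitrary local name
    return (r"(?:Q\{(?:" + allowed + r")\}[^~{}]+~){"
            + str(min_occurs) + "," + upper + "}")


# One declared element followed by any number of extension elements
HEAD = r"(?:Q\{http://example\.com/ns\}head~){1,1}"
MODEL = re.compile(HEAD + wildcard_term(["http://example.com/ext"], 0, None))

ok = MODEL.fullmatch(
    "Q{http://example.com/ns}head~Q{http://example.com/ext}anything~")
bad = MODEL.fullmatch(
    "Q{http://example.com/ns}head~Q{http://example.com/other}anything~")
# ok is a match object and bad is None, but neither result records whether
# the wildcard was strict or lax; that would have to be tracked separately.
```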

The approach depends on the "Element declarations consistent" constraint, which means that to validate a child element you only need to know its name; you don't need to know which particle it matched -- unless it was a wildcard.
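One way to picture what that constraint buys you, as a small Python sketch: within a content model, a table keyed by element name is enough to find the type for each child. The `(name, type)` pairs and the function name are invented for the example, and the consistency check here is a simplification of the real rule.

```python
from typing import Dict, Iterable, Tuple


def build_type_table(particles: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map element name -> type name, rejecting inconsistent declarations."""
    table: Dict[str, str] = {}
    for name, type_name in particles:
        # setdefault returns the existing entry if the name was seen before
        if table.setdefault(name, type_name) != type_name:
            raise ValueError(f"inconsistent declarations for {name!r}")
    return table


# The same name may appear in several particles as long as the type agrees
TYPES = build_type_table([
    ("title", "xs:string"),
    ("author", "PersonType"),
    ("title", "xs:string"),
])

# After the regex match succeeds, each child is validated by name alone
child_types = [TYPES[name] for name in ["title", "author", "author"]]
```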

Michael Kay
Saxonica

> On 12 May 2022, at 11:29, Roger L Costello <costello@mitre.org> wrote:
> 
> Michael Kay wrote:
> ---------------------------------------
> I've considered the approach of validating complex types by turning them into regular expressions against a string and using a regex engine. The main reason I decided against it is that regex engines produce no useful diagnostics; they just tell you the string doesn't match. Perhaps the answer to that would be to write a regex engine with better diagnostics - I can see that being useful!
> ---------------------------------------
> 
> Oh, that sounds wicked cool. Michael, would you describe at a high level how one would go about converting a complexType into regular expressions, please? Would one complexType be transformed into one regular expression or into multiple regular expressions?
> 
> /Roger
> 


