
Re: datatype functionality I'd like to see


Hi,

> Perhaps I am uninformed however, can anyone think of any particular
> schema language one can do this in, and if you are the person who
> knows of such a language can you give me an example if possible.
> (not that it's something I need to do, just something I thought
> would be extremely useful to be able to do at some point)

This was one of the features of the datatype library language that I
have been working on [1]. You could do something like (bearing in mind
I don't know how SSNs actually work):

  <define name="Digit">
    <charGroup><range from="0" to="9" /></charGroup>
  </define>

  <datatype name="SSN">
    <parse>
      <group name="state">
        <repeat exactly="3"><ref name="Digit" /></repeat>
      </group>
      <string>-</string>
      <group name="individual">
        <repeat exactly="2"><ref name="Digit" /></repeat>
        <string>-</string>
        <repeat exactly="4"><ref name="Digit" /></repeat>
      </group>
    </parse>
    ...
  </datatype>

and in the rest of the datatype definition you'd work with a tree
containing <state> and <individual> elements. For example, the SSN
123-12-1234 would become:

  <SSN><state>123</state>-<individual>12-1234</individual></SSN>
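
As a rough illustration of that parse step (in Python rather than in the datatype library language itself, and with a made-up helper name, parse_ssn), the same tree could be built from an ordinary regular expression. This is only a sketch of the idea, not how the implementation mentioned below works:

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern mirroring the <parse> definition above: three digits,
# a hyphen, then two digits, a hyphen and four digits.
SSN_PATTERN = re.compile(r"([0-9]{3})-([0-9]{2}-[0-9]{4})")

def parse_ssn(value):
    """Return an <SSN> element for value, or None if it does not match."""
    match = SSN_PATTERN.fullmatch(value)
    if match is None:
        return None
    root = ET.Element("SSN")
    state = ET.SubElement(root, "state")
    state.text = match.group(1)
    state.tail = "-"  # the literal hyphen sits between the two groups
    individual = ET.SubElement(root, "individual")
    individual.text = match.group(2)
    return root

print(ET.tostring(parse_ssn("123-12-1234"), encoding="unicode"))
# prints <SSN><state>123</state>-<individual>12-1234</individual></SSN>
```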

At http://www.jenitennison.com/datatypes/#implementation, there's an
implementation that transforms the datatype library syntax into an
XSLT 2.0 stylesheet that contains a bunch of functions for each
datatype. You could probably do something with Schematron such that
you declare the datatypes in the Schematron schema and then use them
in the test expressions, as long as you were happy using an XSLT 2.0
processor, but I haven't pursued that.
  
I'm currently in the process of revising the language I initially came
up with so that (among other changes) you can just use named
subexpressions within a regular expression; something like:

  <datatype name="SSN">
    <format>
      <regex>(?[state][0-9]{3})-(?[individual][0-9]{2}-[0-9]{4})</regex>
    </format>
    ...
  </datatype>

or use other (extensible) methods for expressing the format of a
value, such as BNF or PEGs or whatever the particular datatype library
processor understands, but it's all work in progress...
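
To give a feel for how named subexpressions might behave, here is a small Python sketch that rewrites the draft (?[name]...) syntax into Python's own (?P<name>...) named groups and then matches a value. The to_python_regex helper is hypothetical, and the rewrite is deliberately naive (it would misfire on an escaped parenthesis followed by ?[...]), so treat it as an illustration only:

```python
import re

def to_python_regex(pattern):
    """Rewrite the draft (?[name]...) group syntax as Python's (?P<name>...).

    Naive textual rewrite: it does not account for escaped parentheses.
    """
    return re.sub(r"\(\?\[([A-Za-z_][A-Za-z0-9_]*)\]", r"(?P<\1>", pattern)

draft = "(?[state][0-9]{3})-(?[individual][0-9]{2}-[0-9]{4})"
compiled = re.compile(to_python_regex(draft))

match = compiled.fullmatch("123-12-1234")
if match is not None:
    # Each named subexpression becomes a key in the resulting dictionary.
    print(match.groupdict())
    # prints {'state': '123', 'individual': '12-1234'}
```

The dictionary of named groups plays roughly the same role as the <state> and <individual> elements in the tree above.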

Cheers,

Jeni

[1] http://www.jenitennison.com/datatypes/

---
Jeni Tennison
http://www.jenitennison.com/

