
RE: Schematron: Categories of Usage?

  • From: "Costello, Roger L." <costello@m...>
  • To: <xml-dev@l...>
  • Date: Mon, 22 Jan 2007 08:15:08 -0500

Excellent!  Thanks Bryan.

Bryan has identified another way that Schematron may be used for
checking data in an XML instance document:

Algorithmic Checking: the validity of data in an XML instance document
is determined not by mere examination or comparison of the data, but
by running an algorithm on the data.

Here are the ways that Schematron is being used today:

1. Co-constraint checking
2. Existence checking
3. Algorithmic checking

Any others?

/Roger


-----Original Message-----
From: bryan rasmussen [mailto:rasmussen.bryan@g...] 
Sent: Monday, January 22, 2007 7:14 AM
To: Costello, Roger L.
Cc: xml-dev@l...
Subject: Re:  Schematron: Categories of Usage?

Algorithmic checking:

The following validates the check digit of EAN location numbers, using
the algorithm described here (sorry, it's in Danish):
http://www.ean.dk/EAN_Sys/helpdesk/faq/kntrlcif.htm#EAN%20Lokationsnummer

<sch:rule context="*[@schemeID]">
    <sch:report test="@schemeID='EAN' and string-length(.) != 13">
WARNING: EAN numbers are 13 digits in length
</sch:report>
<sch:report test="@schemeID='EAN' and . != (. + 1) - 1">
WARNING: EAN numbers must consist of digits only
</sch:report>
<sch:report test="@schemeID='EAN' and substring(.,13,1)!=0 and ((((10
- substring((substring(.,1,1) * 1 + substring(.,2,1) * 3) +
(substring(.,3,1) * 1 + substring(.,4,1) * 3) + (substring(.,5,1) * 1
+ substring(.,6,1) * 3) + (substring(.,7,1) * 1 + substring(.,8,1) *
3) + (substring(.,9,1) * 1 + substring(.,10,1) * 3) +
(substring(.,11,1) * 1 + substring(.,12,1) *
3),string-length((substring(.,1,1) * 1 + substring(.,2,1) * 3) +
(substring(.,3,1) * 1 + substring(.,4,1) * 3) + (substring(.,5,1) * 1
+ substring(.,6,1) * 3) + (substring(.,7,1) * 1 + substring(.,8,1) *
3) + (substring(.,9,1) * 1 + substring(.,10,1) * 3) +
(substring(.,11,1) * 1 + substring(.,12,1) * 3)),1)) +
((substring(.,1,1) * 1 + substring(.,2,1) * 3) + (substring(.,3,1) * 1
+ substring(.,4,1) * 3) + (substring(.,5,1) * 1 + substring(.,6,1) *
3) + (substring(.,7,1) * 1 + substring(.,8,1) * 3) + (substring(.,9,1)
* 1 + substring(.,10,1) * 3) + (substring(.,11,1) * 1 +
substring(.,12,1) * 3))) - ((substring(.,1,1) * 1 + substring(.,2,1) *
3) + (substring(.,3,1) * 1 + substring(.,4,1) * 3) + (substring(.,5,1)
* 1 + substring(.,6,1) * 3) + (substring(.,7,1) * 1 + substring(.,8,1)
* 3) + (substring(.,9,1) * 1 + substring(.,10,1) * 3) +
(substring(.,11,1) * 1 + substring(.,12,1) * 3))) != substring(.,13,1)
)">
There is an improperly formatted EAN number.
</sch:report>
<sch:report test="@schemeID='EAN' and substring(.,13,1) =0 and
substring((substring(.,1,1) * 1 + substring(.,2,1) * 3) +
(substring(.,3,1) * 1 + substring(.,4,1) * 3) + (substring(.,5,1) * 1
+ substring(.,6,1) * 3) + (substring(.,7,1) * 1 + substring(.,8,1) *
3) + (substring(.,9,1) * 1 + substring(.,10,1) * 3) +
(substring(.,11,1) * 1 + substring(.,12,1) *
3),string-length((substring(.,1,1) * 1 + substring(.,2,1) * 3) +
(substring(.,3,1) * 1 + substring(.,4,1) * 3) + (substring(.,5,1) * 1
+ substring(.,6,1) * 3) + (substring(.,7,1) * 1 + substring(.,8,1) *
3) + (substring(.,9,1) * 1 + substring(.,10,1) * 3) +
(substring(.,11,1) * 1 + substring(.,12,1) * 3)),1) != 0">
There is an improperly formatted EAN number.
</sch:report>
</sch:rule>

Don't worry, verbosity isn't a concern in XML.  :)
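
For comparison, here is a minimal Python sketch of the arithmetic that
the report tests above encode: weight the first 12 digits alternately by
1 and 3, sum them, and require the 13th digit to bring that sum up to a
multiple of 10. The function names are made up for illustration and do
not come from the original rule; treat this as a sketch of the check,
not as a replacement for the Schematron.

```python
def _is_ascii_digits(value: str) -> bool:
    """True if value is non-empty and contains only the characters 0-9."""
    return value != "" and all(c in "0123456789" for c in value)


def ean13_check_digit(first12: str) -> int:
    """Compute the check digit for the first 12 digits of an EAN-13 number."""
    if len(first12) != 12 or not _is_ascii_digits(first12):
        raise ValueError("expected exactly 12 digits")
    # Positions 1, 3, 5, ... (1-based) get weight 1; positions 2, 4, 6, ... get 3.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    # The trailing "% 10" maps a result of 10 to 0, which is the case the
    # Schematron handles with a separate report.
    return (10 - total % 10) % 10


def is_valid_ean13(value: str) -> bool:
    """Check length, digits-only content, and the check digit."""
    return (
        len(value) == 13
        and _is_ascii_digits(value)
        and ean13_check_digit(value[:12]) == int(value[12])
    )
```

For example, is_valid_ean13("4006381333931") returns True, and changing
the last digit to 2 makes it return False.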

The same principles can be used to implement a great number of
algorithms where the boundaries of the problem are known. In this
case, I know that the sequence is exactly 13 characters long, no
more and no less.

Actually, because of the way Schematron's assert works, one can also
check sequences where the possible upper bound is known, even if it is
not known whether that upper bound is actually reached.

I did a proof of this recently (generated the code, of course; it took
86 assertions to implement the check). The requirement was that, in a
text string, the text between consecutive line feeds be no longer than
37 characters, and that there be no more than 45 line feeds.

The generated assertions were, of course, that the string-length of the
string between line feeds 1 and 2 was less than 38, that the
string-length of the string between line feeds 2 and 3 was less than
38, and so forth.

If there were only two line feeds, the other assertions did not fail,
because of the way they were worded.

It took 86 assertions because I split on whether the last line had to
end with a line feed. Unfortunately my laptop burnt out (nothing to do
with this example) and I hadn't backed it up, because it was sort of
a fun experiment, not for actual use.
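
Since the generated assertions are lost, the following is only a guess
at the idea, sketched in Python rather than Schematron. It runs one
fixed check per possible line, up to the known upper bound, and lets
the checks for lines that do not exist pass trivially because the
missing line is treated as an empty string. The helpers imitate the
XPath 1.0 functions substring-after() and substring-before(); all names
here are hypothetical and the original wording of the assertions is
unknown.

```python
from typing import List

MAX_LINE_LENGTH = 37  # from the post: each line must be under 38 characters
MAX_LINEFEEDS = 45    # from the post: no more than 45 line feeds


def substring_after(s: str, sep: str) -> str:
    """Imitates XPath 1.0 substring-after(): '' when sep does not occur."""
    _, found, tail = s.partition(sep)
    return tail if found else ""


def substring_before(s: str, sep: str) -> str:
    """Imitates XPath 1.0 substring-before(): '' when sep does not occur."""
    head, found, _ = s.partition(sep)
    return head if found else ""


def segment(text: str, k: int) -> str:
    """Return line k (1-based), or '' if the text has fewer than k lines."""
    rest = text
    for _ in range(k - 1):
        rest = substring_after(rest, "\n")
    # Append a line feed so that an unterminated last line is still found.
    return substring_before(rest + "\n", "\n")


def failed_checks(text: str) -> List[int]:
    """Run one check per possible line and return the failing line numbers."""
    # 45 line feeds separate at most 46 lines, hence the fixed range.
    return [
        k
        for k in range(1, MAX_LINEFEEDS + 2)
        if len(segment(text, k)) > MAX_LINE_LENGTH
    ]


def too_many_linefeeds(text: str) -> bool:
    """The separate bound on the number of line feeds."""
    return text.count("\n") > MAX_LINEFEEDS
```

With a three-line text whose second line is 38 characters long,
failed_checks returns [2]; the checks for lines 4 through 46 see an
empty string and pass.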

This was in Schematron 1.5, not ISO Schematron; it would be a lot
easier to write this stuff in ISO Schematron. Of course others out
there could probably optimize the code, but it has been checking EAN
numbers for a year and a half now and nobody has reported an error
yet. (fingers crossed)

Cheers,
Bryan Rasmussen





On 1/22/07, Costello, Roger L. <costello@m...> wrote:
> Hi Folks,
>
> I am putting together a list of ways that Schematron is being used. I
> seek your help in ensuring that the list is complete. (I will post the
> final list)
>
> Let me give an example to show what I mean by "ways that Schematron is
> being used".
>
> Consider this simple XML instance document:
>
> <?xml version="1.0"?>
> <Document>
>      <Classification>unclassified</Classification>
>      <Para>
>           Lorem ipsum dolor sit amet,
>           laoreet ac convallis dictumst
>      </Para>
>      <Classification>unclassified</Classification>
> </Document>
>
> Schematron can be used to specify, "The Classification value at the top
> and bottom of the document must match; the Para element must not
> contain any restricted keywords."
>
> Thus, we see Schematron being used to express these two types of data
> constraints:
>
> 1. Co-constraints: in the example the co-constraint is between the two
> Classification values; namely, the two values must be identical.  In
> general, co-constraints are constraints that exist between data
> (element-to-element co-constraints, element-to-attribute,
> attribute-attribute).  The co-constraints may be "within" an XML
> document, or "across" XML documents.
>
> Schematron is very well-suited to expressing co-constraints.
>
> 2. Existence: in the example the existence constraint is that the Para
> element must not contain any restricted keywords.  The keywords may be
> obtained dynamically from another file. In general, existence
> constraints are constraints on the presence or absence of data.  The
> existence constraints may apply over the entire document, or to just
> portions of the document.
>
> Schematron is very well-suited to expressing existence constraints.
>
> Categories of Schematron Usage
>
> Here are the ways that Schematron is being used today:
>
> 1. Co-constraint checking
> 2. Existence checking
>
> Are you using Schematron in ways not represented by these two
> categories?  I am particularly interested in identifying ways
> Schematron is being used which cannot be expressed by other schema
> languages - XML Schemas, Relax NG.
>
> /Roger

