Use Schematron to formally define key concepts

  • From: "Costello, Roger L." <costello@mitre.org>
  • To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
  • Date: Sun, 5 Mar 2017 14:46:21 +0000

Hi Folks,

 

 

For many years, interoperability of a certain XML Schema was achieved only by twice-yearly “bake-offs.” A bake-off was an event at which engineers from various vendors gathered in one room, with all their equipment, to test and re-program until their equipment interoperated on the test cases. Even for the baseline schema (the schema before it was extended), engineers would spend many hours trying to answer simple questions such as, “Can an application endpoint in state S send an XML message of type m?” Engineers would search the schema and other documents for clues, and argue over the meanings of elements and attributes like Biblical scholars. Certainty was rarely achieved.

 

 

XML Schemas are only slightly more formal than natural language descriptions. Users are right to distrust XML Schemas.

 

Use Schematron to formally define key concepts.

 

Let’s take an example. Consider the following excerpt from an XML Schema for a network protocol.

 

<xs:element name="node">
    <xs:complexType>
        <xs:sequence>
            <!-- node stuff -->
        </xs:sequence>
        <xs:attribute name="following" type="nodeId" use="required" />
        <xs:attribute name="id" type="nodeId" use="required" />
    </xs:complexType>
</xs:element>

 

The schema states that a node can follow another node. Follow in what sense?

 

If “following” means following pointers, then the statement is a tautology. If “following” means following in integer order on node identifiers, then it is more meaningful but still wrong – the successor of node 40 may be 5, which does not follow it in integer order (node identifiers come from a bounded set of natural numbers, and the identifiers wrap around from the highest number to zero).
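The wrap-around can be made concrete with a small Python sketch. It is only an illustration, not part of the schema: the helper name ring_successors is hypothetical, and it assumes that a node's successor is the next identifier present in the network, wrapping from the highest back to the lowest.

```python
def ring_successors(ids):
    """Map each identifier to the next one present on the ring.

    Assumption (not from the schema): the successor of the highest
    identifier wraps around to the lowest one.
    """
    ordered = sorted(ids)
    return {
        current: ordered[(i + 1) % len(ordered)]
        for i, current in enumerate(ordered)
    }


successors = ring_successors([5, 29, 40])
print(successors)  # {5: 29, 29: 40, 40: 5}
```

The successor of 40 comes out as 5, which is smaller than 40, so plain integer order cannot serve as the definition of “following.”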

 

In fact, it is not useful to define “following” in such an ordering because every identifier follows (and precedes) every other identifier. A more useful concept is that of “between,” defined by the following Schematron predicate (XSLT functions may be embedded in Schematron):

 

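<!--
    The predicate Between is true if and only if
    argument n2 lies between arguments n1 and n3.

    The function returns the text "true" when the predicate holds
    and the empty sequence otherwise, so not(pred:Between(...))
    behaves as expected under XPath's effective boolean value rules.
-->
<xsl:function name="pred:Between">
    <xsl:param name="n1" />
    <xsl:param name="n2" />
    <xsl:param name="n3" />

    <xsl:choose>
        <!-- No wrap-around from n1 to n3: ordinary strict ordering -->
        <xsl:when test="number($n1) lt number($n3)">
            <xsl:if test="(number($n1) lt number($n2)) and
                          (number($n2) lt number($n3))"
            >true</xsl:if>
        </xsl:when>
        <!-- The interval from n1 to n3 wraps past the highest identifier -->
        <xsl:otherwise>
            <xsl:if test="(number($n1) lt number($n2)) or
                          (number($n2) lt number($n3))"
            >true</xsl:if>
        </xsl:otherwise>
    </xsl:choose>
</xsl:function>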
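
For readers who want to try the predicate outside an XSLT processor, here is a minimal Python port. It is a sketch for experimentation only: the name between is hypothetical, and it assumes the identifiers are already numbers (the XSLT version converts its arguments with number()).

```python
def between(n1, n2, n3):
    """True iff n2 lies strictly between n1 and n3 on the identifier ring.

    Mirrors the two branches of pred:Between.
    """
    if n1 < n3:
        # No wrap-around from n1 to n3: ordinary strict ordering.
        return n1 < n2 and n2 < n3
    # The interval from n1 to n3 wraps past the highest identifier.
    return n1 < n2 or n2 < n3


print(between(5, 29, 40))  # True
print(between(40, 29, 5))  # False
print(between(40, 2, 5))   # True
```

The last call shows the wrap-around case: 2 lies between 40 and 5 because the interval runs from 40 up to the highest identifier and then continues from zero up to 5.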

 

Here is a valid XML instance document:

 

<network>
    <node id="5" following="29"/>
    <node id="29" following="40"/>
    <node id="40" following="5"/>
</network>

 

For each node n1, there is no node between n1 and n1.following. For example, the node following 5 is 29, and no node lies between them. The node following 40 is 5, and no node lies between them either. This might seem counterintuitive, but check the Between predicate and you will see that 29 is not between 40 and 5: since 40 is not less than 5, the otherwise branch applies, and neither 40 < 29 nor 29 < 5 holds. “Intuition” is often wrong – another reason that formal definitions are needed.

This is an invalid XML instance document:

 

<network>
    <node id="5" following="40"/>
    <node id="29" following="40"/>
    <node id="40" following="5"/>
</network>

 

The node following 5 is 40, but there is a node between them: node 29 (5 < 29 and 29 < 40, so Between holds).

 

We are ready to formally define the “following” concept:

 

<!--
    Formal definition of the "following" concept.

    let nodes = { nodes in the network }
    all disj n1, n2, n3: node |
        n2 = n1.following
            => ! Between[n1,n3,n2]
-->
<sch:pattern id="Formal-Definition-of-following">
    <sch:rule context="network">
        <sch:let name="nodes" value="node"/>
        <sch:assert test="
            every $n1 in $nodes, $n2 in $nodes, $n3 in $nodes satisfies
                if (pred:Disjoint($n1, $n2, $n3)) then
                    if (number($n2/@id) eq number($n1/@following)) then
                        not(pred:Between($n1/@id, $n3/@id, $n2/@id))
                    else true()
                else true()
            ">
            No third node falls between a node and its following node.
        </sch:assert>
    </sch:rule>
</sch:pattern>

“If n1 and n2 are distinct network nodes, and n2 is the successor of n1, then no third network node falls between them.”
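
The same rule can be checked by hand with a short Python sketch. This is an illustrative port, not the Schematron itself: the names between and following_is_consistent are hypothetical, and each node is modeled as an (id, following) pair of integers. Picking three nodes at distinct positions stands in for the pred:Disjoint test.

```python
from itertools import permutations


def between(n1, n2, n3):
    """True iff n2 lies strictly between n1 and n3 on the identifier ring."""
    if n1 < n3:
        return n1 < n2 and n2 < n3
    return n1 < n2 or n2 < n3


def following_is_consistent(nodes):
    """Check the rule: no third node falls between a node and its successor.

    nodes is a list of (id, following) pairs.
    """
    # permutations() picks three nodes at distinct positions.
    for n1, n2, n3 in permutations(nodes, 3):
        n2_follows_n1 = n2[0] == n1[1]
        if n2_follows_n1 and between(n1[0], n3[0], n2[0]):
            return False
    return True


valid = [(5, 29), (29, 40), (40, 5)]
invalid = [(5, 40), (29, 40), (40, 5)]

print(following_is_consistent(valid))    # True
print(following_is_consistent(invalid))  # False
```

The first list corresponds to the valid instance document and the second to the invalid one, where node 29 falls between node 5 and its declared successor 40.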

 

An XML Schema without a Schematron schema is dangerously ambiguous.

 

/Roger

 

P.S.#1 Everything said about the dangerous ambiguity of XML Schemas also applies to UML.

 

P.S.#2 Acknowledgement: Some of the ideas presented herein, even some sentences, come from this fantastic paper by Pamela Zave (AT&T Labs, Princeton University): http://web2.research.att.com/export/sites/att_labs/people/Zave_Pamela/custom/wripe.pdf

 

Here is the complete Schematron schema:

<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            queryBinding="xslt2">

    <sch:ns uri="predicate" prefix="pred"/>

    <!--
        The predicate Between is true if and only if
        argument n2 lies between arguments n1 and n3.

        The function returns the text "true" when the predicate holds
        and the empty sequence otherwise, so not(pred:Between(...))
        behaves as expected under XPath's effective boolean value rules.
    -->
    <xsl:function name="pred:Between">
        <xsl:param name="n1" />
        <xsl:param name="n2" />
        <xsl:param name="n3" />

        <xsl:choose>
            <!-- No wrap-around from n1 to n3: ordinary strict ordering -->
            <xsl:when test="number($n1) lt number($n3)">
                <xsl:if test="(number($n1) lt number($n2)) and
                              (number($n2) lt number($n3))"
                >true</xsl:if>
            </xsl:when>
            <!-- The interval from n1 to n3 wraps past the highest identifier -->
            <xsl:otherwise>
                <xsl:if test="(number($n1) lt number($n2)) or
                              (number($n2) lt number($n3))"
                >true</xsl:if>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:function>

    <!--
        The predicate Disjoint is true if and only if
        all three arguments are different (compared by node identity)
    -->
    <xsl:function name="pred:Disjoint">
        <xsl:param name="n1" />
        <xsl:param name="n2" />
        <xsl:param name="n3" />

        <xsl:choose>
            <xsl:when test="$n1 is $n2" />
            <xsl:when test="$n1 is $n3" />
            <xsl:when test="$n2 is $n3" />
            <xsl:otherwise>true</xsl:otherwise>
        </xsl:choose>
    </xsl:function>

    <!--
        Formal definition of the "following" concept.

        let nodes = { nodes in the network }
        all disj n1, n2, n3: node |
            n2 = n1.following
                => ! Between[n1,n3,n2]
    -->
    <sch:pattern id="Formal-Definition-of-following">
        <sch:rule context="network">
            <sch:let name="nodes" value="node"/>
            <sch:assert test="
                every $n1 in $nodes, $n2 in $nodes, $n3 in $nodes satisfies
                    if (pred:Disjoint($n1, $n2, $n3)) then
                        if (number($n2/@id) eq number($n1/@following)) then
                            not(pred:Between($n1/@id, $n3/@id, $n2/@id))
                        else true()
                    else true()
                ">
                No third node falls between a node and its following node.
            </sch:assert>
        </sch:rule>
    </sch:pattern>

</sch:schema>

 

Here is the XML Schema:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="network">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="node" maxOccurs="unbounded" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>

    <xs:element name="node">
        <xs:complexType>
            <xs:sequence>
                <!-- node stuff -->
            </xs:sequence>
            <xs:attribute name="following" type="nodeId" use="required" />
            <xs:attribute name="id" type="nodeId" use="required" />
        </xs:complexType>
    </xs:element>

    <xs:simpleType name="nodeId">
        <xs:restriction base="xs:unsignedByte"/>
    </xs:simpleType>

</xs:schema>

 


