Re: The awesome power of Schematron + XPath 2.0 ... Able to ex
- From: "Fraser Goffin" <goffinf@g...>
- To: "Costello, Roger L." <costello@m...>
- Date: Thu, 25 Oct 2007 17:47:51 +0100
See earlier comments from David Carlisle. It is possible to set the Saxon extension attribute :-
saxon:allow-all-built-in-types
and that will allow use of built in types that are otherwise not permitted under the W3C conformance rules for a Basic XSLT Processor (apparently :-).
I tried this by slightly amending the iso skeleton (for saxon) and the stylesheet produced did then work successfully with SaxonB.
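For anyone wanting to try the same thing, here is a minimal sketch of what the root element of the generated validation stylesheet might look like once the attribute is in place. The saxon namespace URI and the "yes" value are assumptions on my part, and this thread does not show where the skeleton was amended, so check both against the Saxon documentation for your release.

```xml
<!-- Sketch only: root element of the generated validation stylesheet.
     The saxon namespace URI and the "yes" value are assumptions to verify
     against the Saxon documentation for the release in use. -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:saxon="http://saxon.sf.net/"
                saxon:allow-all-built-in-types="yes">

    <!-- With the attribute in place, a test such as this one is intended
         to compile under SaxonB as well. -->
    <xsl:template match="NumParas">
        <xsl:if test="not(. castable as xs:nonNegativeInteger)">
            <xsl:message>NumParas is not a nonNegativeInteger</xsl:message>
        </xsl:if>
    </xsl:template>

</xsl:stylesheet>
```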
Fraser.
On 25/10/2007, Costello, Roger L. <costello@m...> wrote:
Oops! Thanks Fraser. I will fix the bugs. Yes, Saxon-SA is required for using the expression "castable as xs:nonNegativeInteger". /Roger
From: Fraser Goffin [mailto:goffinf@g...]
Sent: Thursday, October 25, 2007 7:56 AM
To: xml-dev@l...
Subject: Re: The awesome power of Schematron + XPath 2.0 ... Able to express all my data requirements!
Hi Roger,
since I am thinking of moving both to ISO schematron and XSLT2, I gave your example above a try. For the present I am using :-
- SaxonB (v8.9)
- iso_schematron_skeleton_for_saxon.xsl
I encountered a couple of issues :-
1. When creating the implementation stylesheet Saxon emitted the warning :-
'Running an XSLT 1.0 stylesheet with an XSLT 2.0 processor'
It nonetheless produced the stylesheet successfully, so not sure why ?
2. When running this stylesheet against the sample input a couple of errors occurred which I think are because :-
>> (string-length <= 200) and
should be :-
(string-length() <= 200) and
and :-
>> (matches(.,'[\sa-zA-Z0-9,;:\.]*))
should be (quote missing after the *) :-
(matches(.,'[\sa-zA-Z0-9,;:\.]*'))"
3. Neither the test for xs:nonNegativeInteger nor the test for xs:long would work with SaxonB; it said :-
'The type xs:nonNegativeInteger is not recognised by a basic XSLT processor'
Changing both to xs:integer was OK, so I don't know if this is a limitation of SaxonB which will go away when I switch to SaxonSA ?
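A possible middle ground on a basic processor, sketched here as an assumption rather than something tested in this thread, is to keep xs:integer and add the range check by hand. The fragment assumes the sch and xs prefixes are bound as in the schema under discussion.

```xml
<!-- Sketch: approximates "castable as xs:nonNegativeInteger" using only
     xs:integer. The if/then/else keeps xs:integer(.) from being evaluated
     on a value that cannot be cast. -->
<sch:rule context="NumParas">
    <sch:assert test="if (. castable as xs:integer) then xs:integer(.) ge 0 else false()"
                see="Data Requirement 8.6">
        The value of NumParas is a nonNegativeInteger.
    </sch:assert>
</sch:rule>
```

The conditional is used instead of a plain "and" because XPath 2.0 does not promise to evaluate the operands of "and" from left to right.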
Regards
Fraser.
On 25/10/2007, Costello, Roger L. <costello@m...> wrote:
Hi Folks,
A few days ago Rick Jelliffe mentioned some of the new capabilities that XPath 2.0 adds to Schematron.
The things that he mentioned sounded very exciting to me, so I put together what is for me a typical set of data requirements. I then implemented those data requirements using Schematron+XPath 2.0. Then, for comparison, I attempted to implement the same data requirements using XML Schemas.
It was a very enlightening experience. Schematron+XPath 2.0 was able to implement all of my data requirements (including all grammar constraints). Conversely, XML Schemas was only able to implement the grammar constraints (which are actually of lesser importance to me than my other data requirements).
Of course, this represents only one example; other examples must be explored. Nonetheless, the fact that Schematron+XPath 2.0 could implement all of my (fairly extensive) data requirements is very exciting.
Below is my set of data requirements followed by the Schematron+XPath 2.0 implementation, as well as the XML Schema implementation. Perhaps you have similar data requirements?
Thanks Rick!
/Roger
-------------------------------------------------------------------------
HIGHLIGHTS OF WHAT I DISCOVERED
Schematron+XPath 2.0 was able to express:
- a security classification policy (data requirement #2)
- a reserved word filter (data requirement #3)
- data integrity checks, including a hashcode check (data requirement #4)
- tracebacks from implementation to data requirements, for accreditation purposes (data requirement #5)
- backward and forward compatibility in a safe fashion (data requirement #6)
- validation in stages, e.g. perform a security classification check first, and if it succeeds only then perform a reserved word check, etc. (data requirement #7)
- all grammar constraints (that are normally implemented using XSD or RNG) (data requirements #1 and #8)
Conversely, XML Schemas was only able to express the grammar constraints (data requirements #1 and #8). It was unable to express the other data requirements (#2 - #7).
-------------------------------------------------------------------------
SAMPLE XML INSTANCE DOCUMENT (i.e. SAMPLE DATA)
<?xml version="1.0" encoding="UTF-8"?>
<Document classification="secret">
    <NumParas>4</NumParas>
    <Para classification="unclassified"> One if by land, two if by sea; </Para>
    <Para classification="confidential"> And I on the opposite shore will be, Ready to ride and spread the alarm </Para>
    <Para classification="unclassified"> Ready to ride and spread the alarm Through every Middlesex, village and farm, </Para>
    <Para classification="secret"> For the country folk to be up and to arm. </Para>
    <Hash>304</Hash>
</Document>
-------------------------------------------------------------------------
DATA REQUIREMENTS
1. ** DOCUMENT ORGANIZATION **
1.1 The document is comprised of one or more paragraphs.
1.2 Each paragraph is labeled with a classification, which can be one of top-secret, secret, confidential, or unclassified.
1.3 A paragraph's text must not exceed 200 characters in length, and shall be comprised of only these characters: a-z, A-Z, 0-9, whitespace, comma, period, colon, semi-colon.
1.4 The document has an overall classification, which can also be one of top-secret, secret, confidential, or unclassified.
1.5 The information in the document may be ordered in any way the author sees fit.
2. ** SECURITY CLASSIFICATION POLICY **
2.1 No paragraph may have a classification higher than the overall document classification.
3. ** RESERVED WORD FILTER **
3.1 No paragraph may contain these reserved words: SCRIPT, FUNCTION.
4. ** DATA INTEGRITY CHECKS **
4.1 The document must contain a count of the number of paragraphs in the document, and that count must match the actual number of paragraphs.
4.2 The document must contain a hashcode, and that hashcode must match the hash of the document.
5. ** ACCREDITATION **
5.1 For accreditation purposes an implementation of any one of these requirements must reference the specific requirement that it is implementing.
6. ** FUTURE REQUIREMENTS **
6.1 Additional future requirements must be backward and forward compatible.
7. ** VALIDATION IN STAGES **
7.1 It must be possible to validate the data in stages, e.g. check the data against the security policy and only perform the other checks if it succeeds.
8. ** XML GRAMMAR **
8.1 The root element is <Document>.
8.2 <Document> has one attribute, classification, whose value can be one of top-secret, secret, confidential, or unclassified.
8.3 <Document> is comprised of one <NumParas>, one or more <Para>, and one <Hash>.
8.3.1 These child elements may occur in any order.
8.4 Each <Para> has one attribute, classification, whose value can be one of top-secret, secret, confidential, or unclassified.
8.5 The value of each <Para> is a string, constrained to a maximum of 200 characters, comprised of only these characters: a-z, A-Z, 0-9, whitespace, comma, period, colon, semi-colon.
8.6 The value of <NumParas> is a nonNegativeInteger.
8.7 The value of <Hash> is a long.
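Requirement 7.1 (validation in stages) maps naturally onto Schematron phases. The following is a minimal sketch, not part of the schema as posted: the phase ids are hypothetical, while the pattern ids are the ones used in the Schematron implementation in this post. The phase elements would sit inside sch:schema, after the sch:let and before the first sch:pattern.

```xml
<!-- Sketch: phase ids are hypothetical; pattern ids come from the schema
     in this post. -->
<sch:phase id="security-first">
    <sch:active pattern="SECURITY-CLASSIFICATION-POLICY"/>
</sch:phase>
<sch:phase id="remaining-checks">
    <sch:active pattern="RESERVED-WORD-FILTER"/>
    <sch:active pattern="DATA-INTEGRITY-CHECKS"/>
    <sch:active pattern="GRAMMAR"/>
</sch:phase>
```

A caller would validate with the first phase selected and move on to the second only if no assertion failed. How a phase is selected at validation time depends on the Schematron implementation in use.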
-------------------------------------------------------------------------
XML SCHEMA IMPLEMENTATION
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

    <xs:element name="Document">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="NumParas" type="xs:nonNegativeInteger"/>
                <xs:element name="Para" maxOccurs="unbounded">
                    <xs:complexType>
                        <xs:simpleContent>
                            <xs:extension base="paraType">
                                <xs:attribute name="classification" type="classificationLevels" use="required"/>
                            </xs:extension>
                        </xs:simpleContent>
                    </xs:complexType>
                </xs:element>
                <xs:element name="Hash" type="xs:long"/>
            </xs:sequence>
            <xs:attribute name="classification" type="classificationLevels" use="required"/>
        </xs:complexType>
    </xs:element>

    <xs:simpleType name="classificationLevels">
        <xs:restriction base="xs:string">
            <xs:enumeration value="top-secret"/>
            <xs:enumeration value="secret"/>
            <xs:enumeration value="confidential"/>
            <xs:enumeration value="unclassified"/>
        </xs:restriction>
    </xs:simpleType>

    <xs:simpleType name="paraType">
        <xs:restriction base="xs:string">
            <xs:maxLength value="200"/>
            <xs:pattern value="[\sa-zA-Z0-9,;:\.]*"/>
        </xs:restriction>
    </xs:simpleType>

</xs:schema>
-------------------------------------------------------------------------
SCHEMATRON IMPLEMENTATION
<?xml version="1.0"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            xmlns:xs="http://www.w3.org/2001/XMLSchema"
            queryBinding="xslt2">

    <sch:let name="document-classification" value="/Document/@classification"/>
    <sch:pattern id="SECURITY-CLASSIFICATION-POLICY">
        <sch:p>A Para's classification value cannot be more sensitive than the Document's classification value.</sch:p>
        <sch:rule context="Para[@classification='top-secret']">
            <sch:assert test="$document-classification='top-secret'"
                        see="Data Requirement 2.1">
                If there is a Para labeled "top-secret" then the Document must be labeled top-secret
            </sch:assert>
        </sch:rule>
        <sch:rule context="Para[@classification='secret']">
            <sch:assert test="($document-classification='top-secret') or
                              ($document-classification='secret')"
                        see="Data Requirement 2.1">
                If there is a Para labeled "secret" then the Document must be labeled either secret or top-secret
            </sch:assert>
        </sch:rule>
        <sch:rule context="Para[@classification='confidential']">
            <sch:assert test="($document-classification='top-secret') or
                              ($document-classification='secret') or
                              ($document-classification='confidential')"
                        see="Data Requirement 2.1">
                If there is a Para labeled "confidential" then the Document must be labeled either confidential, secret or top-secret
            </sch:assert>
        </sch:rule>
    </sch:pattern>
    <sch:pattern id="RESERVED-WORD-FILTER">
        <sch:p>These reserved words are not allowed anywhere in the document: SCRIPT, FUNCTION.</sch:p>
        <sch:rule context="Document">
            <sch:assert test="count(//node()[contains(.,'SCRIPT')]) = 0 and
                              count(//node()[contains(.,'FUNCTION')]) = 0"
                        see="Data Requirement 3.1">
                The document must not contain the words SCRIPT or FUNCTION
            </sch:assert>
        </sch:rule>
    </sch:pattern>
    <sch:pattern id="DATA-INTEGRITY-CHECKS">
        <sch:p>The count value in the NumParas element must match a count of the actual number of Para elements. And the hash value must match the value obtained by recomputing the hash on the current document.</sch:p>
        <sch:rule context="NumParas">
            <sch:assert test=". = ../count(Para)" see="Data Requirement 4.1">
                The count value in this element must match a count of the actual number of Para elements
            </sch:assert>
        </sch:rule>
        <sch:rule context="Hash">
            <sch:assert test=". = sum(for $i in //*[not(*) and not(self::Hash)] return
                                      for $j in string-length($i) return $j)"
                        see="Data Requirement 4.2">
                The value of Hash must match the value that is obtained by recomputing the hash algorithm on the current document
            </sch:assert>
        </sch:rule>
    </sch:pattern>
    <sch:pattern id="GRAMMAR">
        <!-- Within a pattern a node is tested only against the first rule
             whose context matches it, so all the root element assertions
             (8.1, 8.2, 8.3) are kept in this single rule. -->
        <sch:rule context="/*">
            <sch:assert test="name() = 'Document'" see="Data Requirement 8.1">
                The root element must be Document
            </sch:assert>
            <sch:assert test="(count(@*) = 1) and (name(@*) = 'classification') and
                              (($document-classification='top-secret') or
                               ($document-classification='secret') or
                               ($document-classification='confidential') or
                               ($document-classification='unclassified'))"
                        see="Data Requirement 8.2">
                Document has one attribute, classification, whose value can be one of top-secret, secret, confidential, or unclassified.
            </sch:assert>
            <!-- The excluded child name is 'Para' (the post had 'Paras'). -->
            <sch:assert test="(count(NumParas) = 1) and
                              (count(Para) >= 1) and
                              (count(Hash) = 1) and
                              (count(*[name() != 'NumParas' and name() != 'Para' and name() != 'Hash']) = 0)"
                        see="Data Requirement 8.3 and 8.3.1">
                Document is comprised of one NumParas, one or more Para, and one Hash. These child elements may occur in any order.
            </sch:assert>
        </sch:rule>
        <sch:rule context="Para">
            <!-- Tests the Para's own @classification, not the Document's. -->
            <sch:assert test="(count(@*) = 1) and (name(@*) = 'classification') and
                              ((@classification='top-secret') or
                               (@classification='secret') or
                               (@classification='confidential') or
                               (@classification='unclassified'))"
                        see="Data Requirement 8.4">
                Para has one attribute, classification, whose value can be one of top-secret, secret, confidential, or unclassified.
            </sch:assert>
            <!-- string-length() needs its parentheses and the regex its closing
                 quote (both noted in Fraser's reply). The regex is anchored
                 with ^ and $ because matches() otherwise succeeds on any
                 substring, and the less-than sign is escaped for use inside
                 an attribute value. -->
            <sch:assert test="(. castable as xs:string) and
                              (string-length() &lt;= 200) and
                              (matches(., '^[\sa-zA-Z0-9,;:\.]*$'))"
                        see="Data Requirement 8.5">
                The value of a Para is a string, constrained to a maximum of 200 characters, comprised of only these characters: a-z, A-Z, 0-9, whitespace, comma, period, colon, semi-colon.
            </sch:assert>
        </sch:rule>
        <sch:rule context="NumParas">
            <sch:assert test=". castable as xs:nonNegativeInteger" see="Data Requirement 8.6">
                The value of NumParas is a nonNegativeInteger.
            </sch:assert>
        </sch:rule>
        <sch:rule context="Hash">
            <sch:assert test=". castable as xs:long" see="Data Requirement 8.7">
                The value of Hash is a long.
            </sch:assert>
        </sch:rule>
    </sch:pattern>
</sch:schema>
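As an aside on the SECURITY-CLASSIFICATION-POLICY pattern: since XPath 2.0 has sequences, the three rules could probably be collapsed into one by ranking the levels. This is only a sketch and not part of the schema as posted. The pattern id and the levels variable are hypothetical names, and the fragment assumes it sits inside the sch:schema above, where $document-classification is defined.

```xml
<sch:pattern id="SECURITY-CLASSIFICATION-POLICY-COMPACT">
    <!-- Levels listed from least to most sensitive; index-of() gives the rank. -->
    <sch:let name="levels" value="('unclassified', 'confidential', 'secret', 'top-secret')"/>
    <sch:rule context="Para">
        <sch:assert test="index-of($levels, string(@classification)) le
                          index-of($levels, string($document-classification))"
                    see="Data Requirement 2.1">
            A Para must not be labeled more sensitively than the Document.
        </sch:assert>
    </sch:rule>
</sch:pattern>
```

One difference from the three-rule version: if either classification is missing or not one of the four levels, index-of() returns an empty sequence and the assertion fails, so this variant also flags unknown labels instead of leaving them to the GRAMMAR pattern.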