RE: Cross document validation with Schematron - XML syntax for
There is a proposed fine-grained XML syntax for XQuery called XQueryX at
http://www.w3.org/TR/2003/WD-xqueryx-20031219/

Since XPath is a subset of XQuery, this would appear to meet the need.

Michael Kay

# -----Original Message-----
# From: Hunsberger, Peter [mailto:Peter.Hunsberger@s...]
# Sent: 17 March 2004 21:06
# To: xml-dev@l...
# Subject: RE: Cross document validation with
# Schematron - XML syntax for Xpath?
#
# Given the verbosity of this post I'm not surprised that it
# didn't garner much response, but I am a tad surprised that no
# one had any comments whatsoever.
#
# Is there somewhere else to go for Schematron advice?
#
# No comments about the heresy of questioning the need for an
# XML syntax for Xpath?
#
# >
# > First some background: we have a large complex Web app that builds
# > 1000's of different input forms from metadata descriptions in the
# > form of XML. This XML comes from many different spots and describes
# > global metadata, user view specific metadata, authorizations and
# > the current data for a given screen. XSLT transforms take this
# > input XML, smash it together into an abstract object model, and
# > this is in turn forwarded on to another XSLT that does presentation
# > specific transformations (for the Web app that means turning it
# > into XHTML).
# >
# > The user does a standard HTTP POST back to us, and we get the
# > request parameters back as XML and run another XSLT transform and
# > then a Schematron transform to validate the input from the user.
# > If Schematron throws an assert we detect that and recycle the input
# > back through the original loop with the appropriate error message;
# > otherwise we continue on to the next screen.
# >
# > The original metadata and the instance specific screens that are
# > built around them are built by business analysts using other
# > screens (that are in turn built by the same system).
# > In particular, we have a validation editor where they describe the
# > validation rules for the input to any given screen. These rules
# > are one step removed from Schematron statements; a simple transform
# > turns them into Schematron. The main reason for not specifying
# > Schematron directly is so that the "validation editor" can pick the
# > rules apart into component pieces when a business analyst wants to
# > go back and edit an existing validation rule; we use XML elements
# > and attributes to build the Xpath, that way we don't have to parse
# > the Xpath (though we're probably going to go to XSLT for regex
# > support so I suppose parsing the Xpath with regex would be just
# > about the same work in the long term).
# >
# > All this works pretty well, but for one issue which I will describe
# > shortly. However, we now have a new requirement which is to be
# > able to validate across multiple documents. We manage clinical
# > research data, so an example would be for someone to be able to
# > specify that a surgery date was after any protocol on study date,
# > or that a surgery date is after a particular instance of a
# > protocol on study date. In this case, the data being validated is
# > in the surgery document and the data it is being validated against
# > is in the protocol document. (In reality all this is pulled out of
# > a database on the fly, but the mechanics of how these documents are
# > actually created should be more or less irrelevant to the problem
# > at hand?)
# >
# > First issue:
# >
# > Writing Schematron asserts can be non-intuitive for a business
# > analyst. Consider, for example, a document that reports many lab
# > results. We may want to say that the ANC value is between 1000 and
# > 10000.
# > As a Schematron assert it is essentially:
# >
# >   not(*[local-name() ='ANC']) or ( result_val > 1000 and result_val < 10000)
# >
# > I.e., for things that aren't ANC's we are ok, otherwise check the
# > result value. The problem is that a business analyst just doesn't
# > get the "not(x) or" pattern; it might make sense to someone well
# > versed in Boolean logic and xpath, but even some of our more
# > experienced developers get confused on these rules.
# >
# > Given this, and the requirement for cross document validation, we'd
# > like to move the input to our validation process one more step away
# > from Schematron and find or create a language that can be used by
# > the business analysts to specify the validation rules in a manner
# > that is a little more natural to them. For example:
# >
# >   element = 'ANC' and result_val > 1000 and result_val < 10000
# >
# > For Schematron generation that's pretty straightforward; however,
# > more importantly, we also need to be able to use this rule
# > specification to tell us how to generate the other document.
# > Considering my other example, we want something like:
# >
# >   *[local-name() = 'surgery.date'] > *[local-name() = 'protocol.on_study_date']
# >
# > Or
# >
# >   *[local-name() = 'surgery.date'] > *[local-name() = 'protocol.on_study_date' and protocol.mnemonic = 'TOTXV']
# >
# > We want to be able to parse this rule specification to find the
# > fact that we have to do a retrieval of all the protocol data that
# > is in context for this particular patient (or the protocol data in
# > context that has a mnemonic='TOTXV'). Essentially, I think what we
# > need is an XML syntax for xpath that we can turn back into real
# > xpath, or that can be easily parsed so things other than xpath
# > savvy processors can generate data sets that match the xpath.
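As a rough illustration of the "not(x) or" rewrite described above, the following Python sketch turns the analyst-style form (an element name plus a condition) into the implication-style test and wraps it in an assert fragment. The function names and the message text are hypothetical and not from the thread; the Schematron namespace declaration is assumed to live on the surrounding schema, and the element name is assumed to contain no quote characters.

```python
from xml.sax.saxutils import escape, quoteattr


def implication_test(guard: str, condition: str) -> str:
    """XPath test meaning: if `guard` matches, `condition` must hold."""
    return f"not({guard}) or ({condition})"


def analyst_rule_to_test(element: str, condition: str) -> str:
    """Turn the analyst form  element = 'X' and <condition>  into an assert test."""
    # Assumption: `element` is a plain name with no quote characters in it.
    guard = f"*[local-name() = '{element}']"
    return implication_test(guard, condition)


def schematron_assert(test: str, message: str) -> str:
    """Serialize an <assert> fragment; quoteattr escapes '<', '>' and '&'."""
    return f"<assert test={quoteattr(test)}>{escape(message)}</assert>"


anc_test = analyst_rule_to_test(
    "ANC", "result_val > 1000 and result_val < 10000"
)
print(anc_test)
print(schematron_assert(anc_test, "ANC must be between 1000 and 10000"))
```

The analyst only ever supplies the element name and the condition; the "if it is an ANC, then ..." logic is added mechanically, which is the part that tends to confuse people when written by hand.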
# >
# > We are running this all on top of Apache Cocoon with Saxon so we
# > more or less have any piece of XML or XSLT handling machinery we
# > might need available to us: protocol resolvers, any and all manner
# > of schema, XSLT in any version, Java classes, and even Java
# > extensions for XSLT if needed, though I'd rather stay away from
# > those.
# >
# > Sorry for the windy post, but finally the real questions: anyone
# > know of any "obvious" way to do this? By obvious I mean some
# > existing spec, or best practice? If not, any thoughts on what a
# > good structure would be for our artificial language that is going
# > to be fed into Schematron and our document retrieval process? Am I
# > missing something with respect to Schematron? Could we hook into
# > some underlying part of an xpath parser and gain our understanding
# > of the xpath there instead of at the higher level (and thus not
# > need the XML syntax for xpath)? Other thoughts or comments?
# >
# > Peter Hunsberger
# >
#
#
# -----------------------------------------------------------------
# The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
# initiative of OASIS <http://www.oasis-open.org>
#
# The list archives are at http://lists.xml.org/archives/xml-dev/
#
# To subscribe or unsubscribe from this list use the subscription
# manager: <http://www.oasis-open.org/mlmanage/index.php>
#
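On the question of a structure for the rule language, one possible shape is sketched below in Python: a small XML vocabulary that can be serialized back to the XPath quoted in the post and also inspected to see which other documents must be retrieved. The element and attribute names (`compare`, `field`, `where`, `doc`, `op`, `equals`) are invented for illustration; this is not XQueryX and not the poster's actual format, and it assumes exactly two `field` children per comparison.

```python
import xml.etree.ElementTree as ET

# Hypothetical operator names mapped to XPath comparison operators.
OPS = {"gt": ">", "lt": "<", "eq": "="}

# Hypothetical rule: surgery date must be after the TOTXV on-study date.
RULE_XML = """
<compare op="gt">
  <field doc="surgery" name="date"/>
  <field doc="protocol" name="on_study_date">
    <where name="mnemonic" equals="TOTXV"/>
  </field>
</compare>
"""


def field_to_xpath(field):
    """Serialize one <field> as *[local-name() = 'doc.name' and ...]."""
    doc, name = field.get("doc"), field.get("name")
    tests = [f"local-name() = '{doc}.{name}'"]
    for where in field.findall("where"):
        tests.append(f"{doc}.{where.get('name')} = '{where.get('equals')}'")
    return "*[" + " and ".join(tests) + "]"


def rule_to_xpath(compare):
    """Serialize a <compare> with exactly two <field> children."""
    left, right = compare.findall("field")
    op = OPS[compare.get("op")]
    return f"{field_to_xpath(left)} {op} {field_to_xpath(right)}"


def required_documents(compare, current_doc):
    """Map each other document the rule mentions to its retrieval filters."""
    needed = {}
    for field in compare.findall("field"):
        doc = field.get("doc")
        if doc == current_doc:
            continue  # data already in the document being validated
        filters = needed.setdefault(doc, {})
        for where in field.findall("where"):
            filters[where.get("name")] = where.get("equals")
    return needed


rule = ET.fromstring(RULE_XML)
print(rule_to_xpath(rule))
print(required_documents(rule, current_doc="surgery"))
```

The retrieval side never touches XPath at all: it reads the same XML the editor writes, so nothing has to parse the generated expression to learn that protocol data with mnemonic TOTXV is needed.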