RE: Cost of catching data errors at the periphery during XML validation vice inside the system, during application processing?

  • From: William Velasquez <wvelasquez@visiontecnologica.com>
  • To: "Costello, Roger L." <costello@mitre.org>, "xml-dev@l..."<xml-dev@l...>
  • Date: Fri, 10 Oct 2014 16:38:12 +0000

I developed a system, currently in production, where one requirement was that empty xs:dateTime values could be stored and the documents would still be valid against an XSD schema.

The client component (an old version of Altova Authentic) couldn't handle xsi:nil correctly, and after many attempts the decision was to temporarily disable schema validation, so that users could enter the empty date-times until we found a solution.
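For readers unfamiliar with xsi:nil, here is a minimal Python sketch of how a consumer could accept an explicitly nilled date-time while still rejecting a merely empty one. The element name `ShipDate` and the function `read_optional_datetime` are hypothetical and not from the system described above. In a real schema the element would also need `nillable="true"`, and `datetime.fromisoformat` only approximates the xs:dateTime lexical form (it also accepts date-only strings, for example).

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Qualified name of the xsi:nil attribute as ElementTree reports it.
XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"

def read_optional_datetime(xml_text):
    """Return None for a nilled element, otherwise the parsed date-time."""
    elem = ET.fromstring(xml_text)
    # xsi:nil is an xs:boolean, so both "true" and "1" mean nilled.
    if elem.get(XSI_NIL) in ("true", "1"):
        if (elem.text or "").strip() or len(elem):
            raise ValueError("a nilled element must be empty")
        return None
    # An empty element without xsi:nil raises ValueError here.
    return datetime.fromisoformat((elem.text or "").strip())

nilled = ('<ShipDate xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
          'xsi:nil="true"/>')
print(read_optional_datetime(nilled))
print(read_optional_datetime("<ShipDate>2014-10-10T16:38:12</ShipDate>"))
```

The point of the sketch is that "no date" can be stated explicitly, so validation does not have to be switched off to permit it.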

Well, the final solution was never found, the users made thousands of data mistakes, and now the customer spends every year (for the last 5 years) about 50% of the initial project cost finding and correcting data mistakes (a total cost overrun of 250% in 5 years, and it could take almost another 2 years).
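The arithmetic behind those figures can be sketched in a few lines of Python. The 50% rate and the 5 years come from the message; the projection assumes that the "almost 2 more years" are rounded up to 2 full years at the same annual rate, which is an assumption and not a reported figure.

```python
ANNUAL_RATE = 0.50   # from the message: about 50% of the initial cost per year
YEARS_SO_FAR = 5     # from the message
EXTRA_YEARS = 2      # assumption: "almost 2 more years", rounded up, same rate

def cumulative_overrun(years, annual_rate=ANNUAL_RATE):
    """Total cleanup cost as a fraction of the initial project cost."""
    return years * annual_rate

print(f"{cumulative_overrun(YEARS_SO_FAR):.0%}")                # 250%
print(f"{cumulative_overrun(YEARS_SO_FAR + EXTRA_YEARS):.0%}")  # 350%
```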

Roger, thanks for giving me the chance to make this public confession. My soul was a slave to the sin of letting the customer make dumb decisions.

- Bill

-----Original Message-----
From: Costello, Roger L. [mailto:costello@mitre.org]
Sent: Friday, October 10, 2014 9:27 AM
To: xml-dev@lists.xml.org
Subject: Cost of catching data errors at the periphery during XML validation vice inside the system, during application processing?

Hi Folks,

Have you done a cost analysis of catching and dealing with data errors during XML validation (at the periphery, before the data has gone into the system) versus catching and dealing with data errors during application processing (after the data has gotten inside the system)?

I am guessing that the relative costs are something like this:

Cost during XML validation: $1
Cost during application processing: $1000

That is, it's a thousand times more expensive to deal with erroneous data once it's gotten inside the system than to catch and deal with errors at the periphery, via XML Schema (or Relax NG schema or Schematron schema) validation. I am making up those figures; I want to know if someone has real figures.

Here's a scenario to illustrate:

A client creates an XML document and sends it to a web service. The XML document contains data about an aircraft, including its takeoff weight. Due to a bug in the client's software (or perhaps the error is a malicious attack), the XML document contains this:

	<TakeoffWeight units="pounds">-12000</TakeoffWeight>

The value for takeoff weight is a negative value, which is clearly an error.

The web service receives the XML document and (at the periphery of the system) validates it against an XML Schema. The schema has a well-constrained declaration for TakeoffWeight:

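<xs:element name="TakeoffWeight">
    <xs:complexType>
        <xs:simpleContent>
            <xs:extension base="weightType">
                <xs:attribute name="units" use="required">
                    <xs:simpleType>
                        <xs:restriction base="xs:string">
                            <xs:enumeration value="pounds" />
                            <xs:enumeration value="kilograms" />
                        </xs:restriction>
                    </xs:simpleType>
                </xs:attribute>
            </xs:extension>
        </xs:simpleContent>
    </xs:complexType>
</xs:element>

<xs:simpleType name="weightType">
    <xs:restriction base="xs:decimal">
        <xs:minInclusive value="0" />
        <xs:maxInclusive value="100000" />
        <xs:fractionDigits value="2" />
    </xs:restriction>
</xs:simpleType>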

Schema validation detects the error and the web service immediately deals with the error (which might include dropping the XML document on the floor).
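To make the periphery check concrete, here is a minimal Python sketch that hand-codes the same constraints as the schema above (decimal content, 0 to 100000, at most 2 fraction digits, units of pounds or kilograms). It is a stand-in for illustration only, not a schema validator; the function name `check_takeoff_weight` is hypothetical, and a real service would more likely run an actual XML Schema validator at this point.

```python
import re
import xml.etree.ElementTree as ET
from decimal import Decimal

# Lexical form of xs:decimal: optional sign, digits, optional fraction, no exponent.
DECIMAL_RE = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")
ALLOWED_UNITS = {"pounds", "kilograms"}

def check_takeoff_weight(xml_text):
    """Return a list of error messages; an empty list means the element passed."""
    try:
        elem = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    if elem.tag != "TakeoffWeight":
        return [f"unexpected element {elem.tag!r}"]

    errors = []
    if elem.get("units") not in ALLOWED_UNITS:
        errors.append("units must be 'pounds' or 'kilograms'")

    text = (elem.text or "").strip()
    if len(elem) or not DECIMAL_RE.fullmatch(text):
        errors.append(f"content is not a decimal: {text!r}")
        return errors

    value = Decimal(text)
    if value < 0:
        errors.append("value is below minInclusive 0")
    if value > 100000:
        errors.append("value is above maxInclusive 100000")
    # Trailing zeros do not count toward fractionDigits.
    if len(text.partition(".")[2].rstrip("0")) > 2:
        errors.append("value has more than 2 fraction digits")
    return errors

print(check_takeoff_weight('<TakeoffWeight units="pounds">-12000</TakeoffWeight>'))
# ['value is below minInclusive 0']
```

The negative weight from the scenario is rejected before any application code sees it, which is the "$1" case.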

Alternatively, suppose the web service receives the XML document and validates it against an unconstrained XML Schema, which has this:

<xs:element name="TakeoffWeight" type="xs:decimal" />

Or worse, the web service doesn't do any validation and immediately hands the XML document off to an application to process it.

The data error is not immediately detected and the erroneous XML document insidiously winds its way deeper and deeper into the web service's system. At what cost?
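As a small illustration of how quietly the error can travel, here is a Python sketch of the unconstrained case. Both function names and the other two weights are hypothetical; `parse_weight_unconstrained` stands in for `type="xs:decimal"`, which accepts any decimal, including negative ones.

```python
from decimal import Decimal

def parse_weight_unconstrained(text):
    # Stand-in for type="xs:decimal": any decimal is accepted, negatives included.
    return Decimal(text)

def total_fleet_weight(weights):
    # Hypothetical downstream step that trusts whatever it is given.
    return sum(weights, Decimal(0))

received = ["52000", "-12000", "31000"]  # the middle value is the bad document
total = total_fleet_weight(parse_weight_unconstrained(w) for w in received)
print(total)  # 71000
```

Nothing fails and the total looks plausible, so the mistake only surfaces later, when someone has to trace a wrong aggregate back to one bad document.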



