Re: Datatypes - it's in the contract
9/28/2002 7:50:50 AM, Jonathan Robie <jonathan.robie@d...> wrote:

> If the applications that use this data require data of the appropriate
> type, and we want validation to be able to determine whether the contract
> is being followed, then we have to allow data types to be declared.

I think there are a number of problems with taking such types all that seriously for real *XML*-centric applications, even accepting the (pretty reasonable!) argument that the schema should define a contract between producers and consumers of data. (I wouldn't quarrel with using types extensively in OO programming languages, nor with exploiting SQL types in SQL-centric programs; I simply think that XML has other use cases and design patterns than these technologies. Disagree? That's another thread!)

First, a schema that handled your example data in a truly useful way would be non-trivial at best (or some non-trivial code would be needed to preprocess data to meet it). Think of instances such as

    <person>
      <ssn>123-456-789</ssn>
      <name>THX-1135</name>
      <children>3.0</children>
    </person>
    <person>
      <ssn>123 456 789</ssn>
      <name>[none of your business]</name>
      <children>three</children>
    </person>

Second, think of data that simply can't be validated by syntax. For example:

    <prime-number-public-key>120349812304897210349876786238746</prime-number-public-key>
    <customer-id>666-1313-0000</customer-id>

Ain't no way a schema validator is going to enforce the contract that those be valid prime numbers, customer IDs, etc. If some procedural code has to be invoked to do that anyway, how much more trouble is it to have the procedural code check to see that the syntax is correct ... or to write validation code that doesn't worry about variant syntaxes such as 6661313000 or 666 1313 000 or 6 6 6 1 3 1 3 0 0 0 0 ad infinitum...
not to mention "six six six one three one three zero zero zero".

If that data comes from humans, &deity; only knows how many creative ways people can find to enter meaningful but syntactically invalid data, and I for one would find it vastly easier to write code to validate a reasonable range of these than to put this stuff in an XML schema. Ultimately, a human is going to have to look at the input in some significant percentage of the cases, and systems designers have to figure out where to draw the line between trying to write code (either procedural or declarative queries/schemas) to handle the weird cases and simply punting to a human. (Ahem, the option of "we don't want your money until you enter the data to our exacting standards" appeals to nerds a LOT more than it appeals to Pointy Haired Bosses!)

So sure, a "contract" specifying the format of data is useful (more useful for design, negotiation, and debugging purposes than for run-time validation IMHO). Doing as much as feasible at the syntactic level with regular expressions / schema / etc. makes a lot of sense in *many* circumstances, so sure, people should be encouraged to use these features WHEN they solve their problems "out of the box." But in many (most?) real-world situations there is no XML formalism to define the contractual constraints appropriately, and the contract must include natural language descriptions, references to mathematical concepts ("primeness"), and database relationships ("the customer id must exist in the database, the customer record it identifies must match the information supplied in the order").

The complaint, basically, is that a vastly disproportionate amount of the W3C's effort has been spent moving from what would be an "80%" solution (roughly what one can do with RELAX NG, perhaps) to a "90%" solution (maybe 95% ... let's not quibble ... it's very significantly under 100%).
This relatively small increase in the actual practical effectiveness of the strongly-typed approach over a more weakly-typed approach does not justify, in the opinion of many who post here, the immense amount of complexity it has added to WXS and XQuery, the difficulty that has caused implementers and end users, not to mention the years added to the time it takes to get the specs to Recommendation status.

So, few would disagree that "it's in the contract". Lots would disagree that the amount of effort/complexity added to XML++ to validate the "contract" with schema-based mechanisms is worth the cost.
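
To make the "procedural code" side of the argument concrete, here is a minimal Python sketch of the kind of lenient check described above for the customer-id example. Everything in it is an assumption made for illustration: the 11-digit length (taken from the 666-1313-0000 sample), the function names, and the in-memory KNOWN_CUSTOMERS set, which merely stands in for the real database lookup the contract would call for.

```python
import re

# Hypothetical stand-in for the customer database mentioned in the post.
KNOWN_CUSTOMERS = {"66613130000"}

# Spelled-out digits, so "six six six ..." style input can be accepted too.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def normalize_customer_id(raw):
    """Reduce a free-form customer id to a bare digit string, or None."""
    digits = []
    # Split on whitespace and hyphens; each token is a digit run or a digit word.
    for token in re.split(r"[\s\-]+", raw.strip().lower()):
        if not token:
            continue
        if re.fullmatch(r"[0-9]+", token):
            digits.append(token)
        elif token in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[token])
        else:
            return None  # unrecognised token: punt to a human
    return "".join(digits) or None


def check_customer_id(raw, expected_length=11):
    """Return the canonical id if it is well-formed and known, else None."""
    canonical = normalize_customer_id(raw)
    # Assumed length rule; a real contract would state its own.
    if canonical is None or len(canonical) != expected_length:
        return None
    # The "must exist in the database" part no schema validator can check.
    return canonical if canonical in KNOWN_CUSTOMERS else None
```

With this sketch, the hyphenated, space-separated, and spelled-out forms of the same id all reduce to one canonical digit string, while anything the code cannot interpret comes back as None, which is the point at which a system would hand the record to a human.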