RE: On Schemas, Namespaces and Syntax vs. Semantics (long but worth it)
I once sat through a presentation where someone pointed out that the worst security flaws are typically caused by situations where code [or objects/components] is reused outside of its original purpose or design guidelines. Trying to make technology that is immune to incompetence is a losing battle because the Universe is one step ahead of you, creating the one person who will outwit your foolproof design. The fact that your attempts at foolproofing involve W3C XML Schema already assures me they aren't.

As for your responses:

a.) The only difference between using attributes from a special processing namespace and PIs is that the attributes can be validated while the PIs cannot. Everything else you mentioned is a red herring.

b.) Given the complexity of W3C XML Schema and the lack of interoperability between implementations, I tend to avoid solutions that involve having people author more schemas than they have to.

c.) Here's one attempt to provide a schema for the no-brainer approach of wrapping the customer data in an element with semantic meaning:

    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               xmlns:cust="urn:xmlns:25hoursaday-com:customer"
               targetNamespace="urn:xmlns:25hoursaday-com:customer"
               elementFormDefault="qualified">

      <xs:complexType name="CustomerActionBase" abstract="true">
        <xs:sequence>
          <xs:element ref="cust:Customer" maxOccurs="unbounded" />
        </xs:sequence>
      </xs:complexType>

      <xs:element name="Customer" type="cust:CustomerType" />

      <xs:complexType name="CustomerType">
        <xs:sequence>
          <xs:element ref="cust:FirstName" />
          <xs:element ref="cust:LastName" />
          <xs:element ref="cust:PhoneNumber" />
          <!-- more stuff goes here like address, etc -->
        </xs:sequence>
        <xs:attribute name="customerID" type="xs:integer" />
      </xs:complexType>

      <xs:element name="FirstName" type="xs:string" />
      <xs:element name="LastName" type="xs:string" />
      <xs:element name="PhoneNumber" type="xs:string" />
      <!-- more stuff goes here -->

    </xs:schema>

The CustomerActionBase is an abstract type, which means that the various departments can make specific elements referring to what actions to perform on a customer, such as loans:collect-loan, loans:send-refund, loans:foreclose, etc., and can all share the same definition of customer data.

d.) I got your point, I just didn't agree with it. Let's roll this back a notch and look at relational databases, since this is most likely where the customer information being stored will come from in the first place. Now it is quite possible that many thousands of departments in hundreds of businesses have a database with the following schema:

    create table foo (
        id integer,
        fname varchar(50) not null,
        lname varchar(50) not null,
        -- various fields including address, phone number, etc
        primary key (id)
    );

Your protestations about XML Schema being relevant to providing semantics are like saying that sending the results of select queries prepended with the table name is enough to provide all the information necessary to process the data at any endpoint. For instance, imagine the above table was named "students" and was sent via EDI, SOAP, whatever to the financial aid department of a college and was now sitting in a processing queue. Does the fact that the name, ID, etc. came from a table containing student information tell the financial aid department whether this message is directing them to disburse the student's loan, cut a refund check, or defer the student's payment obligations?

XML Schemas are about syntax, not about semantics. Thinking otherwise is folly and will lead to fundamentally flawed architecture.
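To make point c.) concrete, here is a minimal Python sketch of how a receiver might process such a message. It rests on assumptions not found in this thread: the `urn:example:loans` namespace, the sample customer values, and the `dispatch` helper are hypothetical, and the action names are simply the ones mentioned above. The customer payload has the same shape under every wrapper; only the wrapper element says what to do with it.

```python
import xml.etree.ElementTree as ET

CUST_NS = "urn:xmlns:25hoursaday-com:customer"  # namespace from the schema above
LOANS_NS = "urn:example:loans"                  # hypothetical department namespace

# Hypothetical instance document: the wrapper element carries the action,
# the cust:Customer payload is the shared customer data.
MESSAGE = f"""
<loans:collect-loan xmlns:loans="{LOANS_NS}" xmlns:cust="{CUST_NS}">
  <cust:Customer customerID="42">
    <cust:FirstName>Ada</cust:FirstName>
    <cust:LastName>Lovelace</cust:LastName>
    <cust:PhoneNumber>555-0100</cust:PhoneNumber>
  </cust:Customer>
</loans:collect-loan>
""".strip()

# Map each wrapper element (in Clark notation, "{namespace}local") to an action.
ACTIONS = {
    f"{{{LOANS_NS}}}collect-loan": "collect",
    f"{{{LOANS_NS}}}send-refund": "refund",
    f"{{{LOANS_NS}}}foreclose": "foreclose",
}


def dispatch(xml_text):
    """Return (action, customer IDs) based on the wrapper element alone."""
    root = ET.fromstring(xml_text)
    action = ACTIONS.get(root.tag)
    if action is None:
        raise ValueError(f"unknown action element: {root.tag}")
    ids = [c.get("customerID") for c in root.findall(f"{{{CUST_NS}}}Customer")]
    return action, ids


if __name__ == "__main__":
    print(dispatch(MESSAGE))
```

Swapping the wrapper for loans:send-refund changes the action without touching the customer data, which is the point of keeping the two separate.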
PS: XML-DEVers, Clemens has indicated to me offlist that he cannot send mail to the list directly since he isn't subscribed, but he would like me to provide the following link to give more context to the original weblog post that started this thread:

http://radio.weblogs.com/0108971/stories/2002/07/23/stayingSaneInAnXmlWebServicesWorld.html

-----Original Message-----
From: Clemens Vasters [mailto:clemensv@n...]
Sent: Sat 7/27/2002 11:16 PM
To: Dare Obasanjo; xml-dev@l...
Cc: rubys@u...
Subject: RE: On Schemas, Namespaces and Syntax vs. Semantics (long but worth it) :)

In fact, "describing an incompetent organization" is my whole point. What I am saying is that since the folks in that other department didn't think long enough and had no rule written up somewhere that forced them to make the schema their own (because they clearly used the schema for another purpose than designed), the whole situation did happen. I've worked quite a few years on projects that deal with this exact business environment and you wouldn't believe how many terms' meanings are entirely reversed from "giving credit" to "reclaiming bad credit". So, the customer is a bit of a broad example.

And, quote me fully: "[Author's note: The "customer" is an illustrative example to carry my point, but there are other, typically much simpler schemas, where such problems are even more likely to occur in the wild]"

As to your proposals:

(a) Processing instructions are, in my view, things that only a low-level technical infrastructure should ever worry about. If you want processing hints, create an additional schema/namespace and use attributes from that schema.

(b) xsd:any, xsi:type are things I like there.

(c) How do you express that in schema properly? When you have expressed it in schema, how does it make sense?
(d) You likely didn't get my point. Both organizations MUST HAVE their own schema, because for both organisations, the "customer" is a wholly different thing. A customer for /corporate credit/ is called a /debtor/ at financial collections and submitted by their type of /customer/. The solution with two different schemas tells you exactly what to do with them, because you need to treat them differently from the start; otherwise your system will not understand....

Sam Ruby's blog can be found here: http://radio.weblogs.com/0101679/
Mine can be found here: http://radio.weblogs.com/0108971/

Best Regards,
Clemens

---------------------
Clemens F. Vasters
CTO, newtelligence AG
Gilleshütte 99
D-41352 Korschenbroich
Germany
MSDN Regional Director
clemensv@n...
v-clevas@m...

-----Original Message-----
From: Dare Obasanjo [mailto:dareo@m...]
Sent: Sunday, July 28, 2002 04:21
To: xml-dev@l...
Cc: clemensv@n...; rubys@u...
Subject: On Schemas, Namespaces and Syntax vs. Semantics (long but worth it) :)

A recent discussion on XML-DEV[0] highlighted the fact that there is some disagreement amongst practitioners of XML as to whether XML schemas and XML namespaces are meant as a means of expressing syntax or both syntax and semantics. The disagreement is exacerbated by the fact that the W3C recommendations describing both technologies do not offer any guidance.

My personal and professional opinion is that neither namespaces nor XML schema offer enough capabilities to specify or even imply semantic differences to any degree beyond the trivial. I recently came across a post by Clemens Vasters[1] which helped condense for me why exactly I am wary of using either of these mechanisms for specifying semantic differences. Below is a condensed excerpt from his post:

"Assume you do work for a bank. The bank has, of course, multiple departments. The first department is corporate loans.
They create an XML Schema to exchange customer information between the main branch and their satellite branches. The schema only covers the base customer information (name, address, primary contacts, etc.). 'Fine! This Schema fits our need just well to exchange all base information about our customers with our branches, as well' says the financial collections division. Note that the customer term means two different things here. A customer of corporate loans doesn't meet two consecutive payment deadlines. So, the corporate loans department hands over their customer data as a SOAP wrapped document to the "EDI mailbox" of the financial collections business where it is queued for processing. At the same time, the usual swapping of customer data happens between the branches of the financial collections business and gets queued for processing in that inbound queue. Now, the customer in the sense of corporate loans is a debtor for financial collections and corporate finance itself is a customer in the sense of financial collections. Now we have two documents with the same namespace but entirely different business semantics stuck in the same queue. What to do?"

Sam Ruby offered the following:

"As to your example, there is no question that two 'customer' entities that are semantically different should be modelled separately."

The funny thing is that I completely disagree with both authors. I disagree with Clemens Vasters for claiming that he has posed a problem, since what he actually described is an incompetent organization and not a business case that highlights any failings of XML technologies. I disagree with Sam Ruby for a.) implying that using namespaces and/or schemas is data modelling and b.) implying that this is a data modelling problem.

I thought of three possible solutions to this "problem" while reading Vasters's weblog and I'm sure XML-DEVers could think of more.
The purpose of posting them to such a wide audience is to dispel the misconception amongst users of XML (especially the web service folks) that XML technologies are a magic pixie dust that can transform poor business processes and bad design decisions into a well-oiled machine.

Sending an XML document containing raw customer data without any extra semantic information is as useless as sending a fax without a cover sheet. Sam Ruby's suggestion [which implies using namespaces to disambiguate] is akin to suggesting that instead of using cover sheets all faxes should be identified by appending the receiver's email address/office number/name to the end of each document. In most cases this will be sufficient, but in any that require additional semantic information there will be no way to convey it. However, with a cover sheet one can scribble any extra instructions and explanation in the section provided.

With my analogy done, here are some [of many] potential solutions to this "problem":

a.) PI solution: If some executive fiat makes it mandatory that only raw customer data conforming to the shared schema may be sent, then processing instructions can be embedded in the SOAP message to indicate to the final receiver of the message what to do with each customer.

b.) Extensible schema solution: Again, if the same executive fiat applies, then the original customer data schema is written in such a way as to allow for W3C XML Schema extensibility (wildcards, xsi:type, substitution groups, abstract elements and types, redefinition, etc.) and each customer, although conforming to the schema, has enough identifying information that it is clear how the data should be processed.

c.) No brainer solution: Place the <customer> element information item within other XML elements which indicate what semantics to use. E.g. loan defaulters are sent as a <loan-defaulter> element information item containing a <customer> element information item.

For completeness I should add:

d.) Multiple schema solution: This is Sam Ruby's solution and it involves each division creating a schema for their customer data [although some repetition can be avoided by using a chameleon schema]; each customer element information item is then processed based on its namespace name. This solution, however, doesn't tell us what to do if one division may send customers to the collections department for different reasons which affect how they are to be treated.

I prefer my solutions to Sam Ruby's because they clearly emphasize the "it's just data" aspect of XML instead of embedding accidental semantics where one wants only syntax and data.

[0] The thread entitled "Schema Namespace name, schemaLocation, and Schema Versioning" started by Mark Feblowitz earlier this month
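As a rough illustration of the PI solution (a.) described above, the following Python sketch reads a processing instruction placed before each customer and pairs it with that customer. Everything specific here is an assumption made for the example: the `customer-action` PI target, the `Customers` wrapper element, the sample IDs, and the `actions_by_customer` helper do not come from the original post. The SOAP envelope is omitted for brevity.

```python
from xml.dom import Node, minidom

# Hypothetical message body: each PI tells the receiver what to do with
# the Customer element that follows it. The customer data itself is untouched.
MESSAGE = """<Customers xmlns="urn:xmlns:25hoursaday-com:customer">
  <?customer-action collect-loan?>
  <Customer customerID="42"/>
  <?customer-action send-refund?>
  <Customer customerID="43"/>
</Customers>"""


def actions_by_customer(xml_text):
    """Map each customerID to the action named by the preceding PI, if any."""
    doc = minidom.parseString(xml_text)
    result = {}
    pending = None  # action from the most recent PI not yet consumed
    for node in doc.documentElement.childNodes:
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == "customer-action"):
            pending = node.data.strip()
        elif node.nodeType == Node.ELEMENT_NODE and node.localName == "Customer":
            result[node.getAttribute("customerID")] = pending
            pending = None
    return result


if __name__ == "__main__":
    print(actions_by_customer(MESSAGE))
```

A schema validator ignores the PIs entirely, which is both the appeal of this approach (the shared schema stays untouched) and its weakness (the instructions themselves cannot be validated).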