Re: Quiz: XML flexibility
At 2005-02-28 00:05 +0200, Razvan MIHAIU wrote:
>...
> Questions like this are asked *today* in an XML certification exam
> like IBM 141.
>...
> This is why I am worried about this exam: on many occasions I found out that
> the question's text did not provide enough information. In such situations you
> just have to try to guess what the exam writer was thinking and as a
> result you are guessing the answer too.

That is so true and hasn't changed.

> I will try to clarify the question.

No need for me, I understood your question and I think there is no "correct" answer.

> The answer from this quiz suggests that
> flexibility can only be achieved by using mixed content models. Should I just
> discard this (as Michael Kay suggested) or there is some true behind this
> affirmation ?

But flexibility at what cost? When you introduce mixed content, how can you constrain the input to be desired sequences? How can you prevent unnecessary and incorrect text between the elements you want in sequence?

Mixed content has a role and it is not opening up a model for "flexibility" while abandoning the ability to constrain to new business requirements ... which defeats the purpose of doing document modeling.

> > Document models described by W3C Schema are extensible only adding new
> > constructs to the end of content models of existing constructs ... to me this
> > is not very flexible at all ... I might want to put new information in the
> > middle of a content model.
>
> You can achieve this with mixed content models.

I disagree. You cannot constrain your new documents to your new business requirements if you throw everything into mixed content. You cannot flexibly add any new requirements for constrained validation if you utilize a modeling semantic (mixed content) that does not constrain.

> With this you can add as many elements as you want where you want.

That's the problem.
I may wish to constrain a new document model to be the old document model plus some new, constrained elements that are found not only at the end of the old document model. Mixed content is not a panacea ... it has its role, and using it for "flexibility" is not it.

So, of the given four answers, I believe none of them is correct and I would challenge arguments that any one is ... have you been told which answer the poser of the question believes is correct?

> > Document models described by RELAX-NG are extensible merely by producing the
> > union of *any* two other document models ... to me this is *very* flexible. I
> > can accommodate old and new instances of any vocabulary in this simple fashion
> > by creating the new union vocabulary.
> >
> > Running a query on a document modeled by W3C Schema can produce an instance
> > result that cannot be modeled by W3C Schema mechanically by a machine analysis
> > of the schema expression. Consider a document where the element "p" is
> > modelled one way in one context using a given type, and modelled another way in
> > another context using a different type, and the query returns an instance of all
> > "p" elements ... since sibling "p" elements in W3C Schema cannot have different
> > content models, I cannot machine-generate a W3C Schema expression of the model
> > of this result, and I'm obliged to do so by hand without the benefit of
> > co-occurrence constraints to help. To me this is not very flexible.
>
> I did not read about XQuery yet.

I did not say anything about XQuery ... all I said is "running a query" ... there are many languages that will give you the result of querying a subset from another document.

> "Fragments", "XQuery" and "XLink" are my
> targets for the next week.

Please don't get distracted from my point ... I'm only talking about obtaining the result of asking for the content from an XML document. I didn't mention any standards for having obtained that content ... it is irrelevant to my point.

> I just want to say that if your "p" is found in the
> same XML instance then it must be in different namespaces otherwise the
> document would be invalid.

This is not true from an instance perspective. Validity is measured by the meeting of expressed constraints using the semantics employed by a validation language. Namespaces have nothing to do with my point.

> Since they are in different namespaces it seems
> logical that there is a way to differentiate between them using their
> namespaces. I will not comment on this any further because I still have to
> read about XQuery.

No, you don't.

DTD validation semantics do not allow an element by name to have two different constraints on the content model in the same document.

W3C Schema validation semantics allow two elements by name that are not siblings to have two different constraints on the content model in the same document.

RELAX-NG validation semantics allow two elements by name to have two different constraints on the content model anywhere in the same document.

> After that, maybe, your words will have a new meaning to me.

Please don't get distracted by fragments, query, or link ... that is irrelevant to my point about flexibility and about using one expression of constraints and a mechanical derivation of a new set of constraints from old constraints.

By "mechanical", I mean without human intervention and without considerable heuristics that would mimic human intervention ... I just mean the simple introduction of a choice that could be easily mechanized.

> In seems that Relax NG is very popular in certain circles. However the
> author of "Professional XML" states that Relax NG is not meant to replace XML
> Schema:
>
> "All of these proposals might be seen as providing lighter-weight alternatives
> to an implementation using XML Schemas. None of them (except perhaps DSD) are
> intended to replace XML Schema since it has many capabilities that are not
> present in these other proposals."
>
> The author was speaking about DSD (Document Structure Description), RELAX
> (Regular Language for XML), TREX (Tree Regular Expressions for XML) and
> Schematron.

RELAX-NG and Schematron are ISO standards in the DSDL (Document Schema Definition Languages) family - ISO/IEC 19757 (parts 2 and 3) that address different requirements for XML document validation semantics and the expressions of constraints.

Different problems need solving with different expressions of constraints, and someone who needs to model a document needs to use the expression language that satisfies the modeling semantics they need.

W3C Schema is a set of type-based constraint expression semantics.

RELAX-NG is a set of grammar-based constraint expression semantics.

Schematron is a set of assertion-based constraint expression semantics.

Anyone who tells you one constraint expression language is trying to replace another is, in my opinion, neither being fair nor showing an understanding of the role that constraint expressions can play in validating XML documents.

Each of the above has plusses and minuses. Everyone should measure what their requirements are for expressing constraints and then choose the appropriate language that has the required validation semantics.

Vendors who tell you there is only one sanctioned schema language for XML are purposely misleading you. Choose the one that meets your needs. Choose different ones for different documents if you need more than one. Choose different ones for the same document if the document life cycle dictates changing needs for validation.

You asked about flexibility and I was commenting on areas where W3C Schema is not as flexible as RELAX-NG. Choosing to use a set of flexible validation semantics will make your XML flexible for the future. Choosing to use a set of validation semantics that are not flexible will box you in and keep you from expressing what you need in the future.
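
To illustrate the objection to mixed content, here is a minimal sketch, assuming a DTD and a hypothetical memo vocabulary (the names memo, to, from and body are illustrative only and come from neither the exam question nor this thread). In a DTD, a mixed content model can only be a repeatable choice of #PCDATA and element names, so the instance below is valid even though its elements are out of the intended order, one is repeated, one is missing, and stray text sits between them.

```xml
<?xml version="1.0"?>
<!-- Hypothetical vocabulary: memo, to, from and body are illustrative names. -->
<!DOCTYPE memo [
  <!-- DTD mixed content must take this form: #PCDATA first, then a choice
       of element names, the whole group repeatable. Neither order nor
       cardinality of the child elements can be stated. -->
  <!ELEMENT memo (#PCDATA | to | from | body)*>
  <!ELEMENT to   (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<!-- Valid against the DTD above, although "body" comes first, "to" is
     repeated, "from" is absent, and loose text sits between the elements. -->
<memo>
  <body>Lunch at noon.</body>
  stray text the model cannot forbid
  <to>Ann</to>
  <to>Bob</to>
</memo>
```

Any new element added to the choice later is just as unconstrained as the existing ones, which is the cost being weighed against the apparent flexibility.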

Throwing things into mixed content doesn't give one any ability to constrain the sequence and order of those constructs, so I am unable to use this putative "extension mechanism" to meet my business objectives. It, in my opinion, does not make my XML at all flexible ... it just gives it the opportunity to be messy without being able to constrain it. I think it is a very wrong answer to making something "flexible".

> > So, to me, these are the kinds of questions to ask to deduce the nature of
> > "what is flexible XML?" ... not anyone's particular choice of a given
> > vocabulary.
>
> You are basically saying that there are no special design decisions to make
> when you want to design a "flexible and open to future changes" XML document.

Yes. That is my point. Anyone who uses XML will face the need to address changes, and choosing to use a set of validation semantics that accommodates change will best serve their needs. One cannot anticipate all the possible changes that might be needed.

> A second thought: the vocabulary may not be important but the way you
> declare the relationships between elements could impact the future
> extensibility of your documents.

Indeed. Absolutely. That is my point.

> I will think about this.

I'm glad to hear this ... and please don't think that any one expression of constraints is the be all and end all of constraint expression. Choose the one that meets your business needs, your technology needs, your training needs, and your comprehension ... hopefully one language will meet all these for you, but if you need more than one, they are all there for the taking.

Note the DSDL title is plural, "Document Schema Definition Languages" ... this project explicitly assumes that there are many schema expression languages for many purposes and that they should all work together to the best of their respective abilities. Now that RELAX-NG and Schematron are both ISO standards, I anticipate more industry (read "vendor") acceptance.
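
As an illustration of the assertion-based style, here is a minimal sketch in ISO Schematron of the two contextual <p> constraints used in the query.xml example in this message. The namespace is the ISO Schematron one; the rule contexts, XPath tests and message wording are illustrative assumptions and only one of several ways to phrase the checks. The tests look only at the child element names and their order; they do not check that <c>, <d> and <e> are empty.

```xml
<?xml version="1.0" encoding="utf-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <!-- a <p> whose parent is <a> must hold exactly <c> then <d> -->
    <rule context="a/p">
      <assert test="count(*) = 2 and *[1][self::c] and *[2][self::d]">
        A p inside a must contain exactly c followed by d.
      </assert>
    </rule>
    <!-- a <p> whose parent is <b> must hold exactly <c> then <e> -->
    <rule context="b/p">
      <assert test="count(*) = 2 and *[1][self::c] and *[2][self::e]">
        A p inside b must contain exactly c followed by e.
      </assert>
    </rule>
  </pattern>
</schema>
```

Assertions of this kind state what must be true at a given context rather than describing a grammar for the whole document, which is why they can complement a W3C Schema or RELAX-NG expression instead of competing with it.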

Back to my example of a generic query, below is an XML document named "query.xml" ... in it are two different kinds of <p>, one in the context of <a> and the other in the context of <b>. I query the document (by whatever means) and I get "queryres.xml" ... in it are all of the <p> elements as siblings.

The original "query.xml" validates using the constraints expressed in "query.wxs". The constraints "queryaut.wxs" could be derived mechanically by introducing a choice around both kinds of <p> elements, but you can see that W3C Schema constraint semantics do not allow this automatically-generated schema expression to validate the result document.

The hand-authored "queryres.wxs" does validate the result just fine, but I had to introduce the choice with some thought, not with a simple union choice of the two kinds of <p>. Human intervention (or a lot of heuristics I wouldn't want to have to program) is required because of the restrictions of the constraint semantics in W3C Schema.

This is not true in RELAX-NG where I can express below in "queryres.rnc" the simple choice between two different kinds of <p> as siblings.

I've run MSV and Jing in the examples below to confirm my results.

Remember I introduced this as an argument regarding flexibility ... choosing W3C Schema is not flexible in some ways because it does not allow mixing sibling elements of the same name with different content models. As business requirements change, content models change, and it could be very easy for an evolving document model to need to accommodate two elements of the same name but different content models. Such "growth" in validation requirements is quite flexibly met in the RELAX-NG validation semantics, as illustrated below.

I hope you find this helpful. ...........................
Ken

R:\samp>type query.xml
<?xml version="1.0" encoding="iso-8859-1"?>
<doc>
  <a>
    <p>
      <c/><d/>
    </p>
  </a>
  <b>
    <p>
      <c/><e/>
    </p>
  </b>
  <a>
    <p>
      <c/><d/>
    </p>
  </a>
</doc>

R:\samp>call msv query.wxs query.xml
No validation errors.

R:\samp>type queryres.xml
<?xml version="1.0" encoding="iso-8859-1"?>
<doc>
  <p>
    <c/><d/>
  </p>
  <p>
    <c/><e/>
  </p>
  <p>
    <c/><d/>
  </p>
</doc>

R:\samp>call msv queryaut.wxs queryres.xml
start parsing a grammar.
validating queryres.xml
Error at line:4, column:13 of file:///R:/samp/queryres.xml
  tag name "d" is not allowed. Possible tag names are: <e>

Error at line:10, column:13 of file:///R:/samp/queryres.xml
  tag name "d" is not allowed. Possible tag names are: <e>

the document is NOT valid.

R:\samp>call msv queryres.wxs queryres.xml
No validation errors.

R:\samp>jing -c queryres.rnc queryres.xml

R:\samp>type query.wxs
<?xml version="1.0" encoding="utf-8"?>
<wxs:schema xmlns:wxs="http://www.w3.org/2001/XMLSchema">
  <wxs:element name="doc">
    <wxs:complexType>
      <wxs:choice maxOccurs="unbounded">
        <wxs:element name="a">
          <wxs:complexType>
            <wxs:sequence>
              <wxs:element name="p" maxOccurs="unbounded">
                <wxs:complexType>
                  <wxs:sequence>
                    <wxs:element name="c">
                      <wxs:complexType/>
                    </wxs:element>
                    <wxs:element name="d">
                      <wxs:complexType/>
                    </wxs:element>
                  </wxs:sequence>
                </wxs:complexType>
              </wxs:element>
            </wxs:sequence>
          </wxs:complexType>
        </wxs:element>
        <wxs:element name="b">
          <wxs:complexType>
            <wxs:sequence>
              <wxs:element name="p" maxOccurs="unbounded">
                <wxs:complexType>
                  <wxs:sequence>
                    <wxs:element name="c">
                      <wxs:complexType/>
                    </wxs:element>
                    <wxs:element name="e">
                      <wxs:complexType/>
                    </wxs:element>
                  </wxs:sequence>
                </wxs:complexType>
              </wxs:element>
            </wxs:sequence>
          </wxs:complexType>
        </wxs:element>
      </wxs:choice>
    </wxs:complexType>
  </wxs:element>
</wxs:schema>

R:\samp>type queryaut.wxs
<?xml version="1.0" encoding="utf-8"?>
<wxs:schema xmlns:wxs="http://www.w3.org/2001/XMLSchema">
  <!--the following automated choice between content models doesn't
      work as validation triggers on only the first definition of <p> -->
  <wxs:element name="doc">
    <wxs:complexType>
      <wxs:choice maxOccurs="unbounded">
        <wxs:element name="p" maxOccurs="unbounded">
          <wxs:complexType>
            <wxs:sequence>
              <wxs:element name="c">
                <wxs:complexType/>
              </wxs:element>
              <wxs:element name="e">
                <wxs:complexType/>
              </wxs:element>
            </wxs:sequence>
          </wxs:complexType>
        </wxs:element>
        <wxs:element name="p" maxOccurs="unbounded">
          <wxs:complexType>
            <wxs:sequence>
              <wxs:element name="c">
                <wxs:complexType/>
              </wxs:element>
              <wxs:element name="d">
                <wxs:complexType/>
              </wxs:element>
            </wxs:sequence>
          </wxs:complexType>
        </wxs:element>
      </wxs:choice>
    </wxs:complexType>
  </wxs:element>
</wxs:schema>

R:\samp>type queryres.wxs
<?xml version="1.0" encoding="utf-8"?>
<wxs:schema xmlns:wxs="http://www.w3.org/2001/XMLSchema">
  <!--the following hand-crafted version works just fine-->
  <wxs:element name="doc">
    <wxs:complexType>
      <wxs:sequence>
        <wxs:element name="p" maxOccurs="unbounded">
          <wxs:complexType>
            <wxs:sequence>
              <wxs:element name="c">
                <wxs:complexType/>
              </wxs:element>
              <wxs:choice>
                <wxs:element name="e">
                  <wxs:complexType/>
                </wxs:element>
                <wxs:element name="d">
                  <wxs:complexType/>
                </wxs:element>
              </wxs:choice>
            </wxs:sequence>
          </wxs:complexType>
        </wxs:element>
      </wxs:sequence>
    </wxs:complexType>
  </wxs:element>
</wxs:schema>

R:\samp>type queryres.rnc
start = element doc {
  ( element p { element c { empty }, element d { empty } } |
    element p { element c { empty }, element e { empty } } )+
}
# end of file

--
World-wide on-site corporate, govt. & user group XML/XSL training.
G. Ken Holman                  mailto:gkholman@C...
Crane Softwrights Ltd.         http://www.CraneSoftwrights.com/x/
Box 266, Kars, Ontario CANADA K0A-2E0    +1(613)489-0999 (F:-0995)
Male Breast Cancer Awareness   http://www.CraneSoftwrights.com/x/bc
Legal business disclaimers:    http://www.CraneSoftwrights.com/legal
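
For tools that read only the XML syntax of RELAX NG, the compact-syntax "queryres.rnc" listed in the transcript can be written as the following sketch. It is a hand translation rather than converter output, and the file name "queryres.rng" is an assumption; with Jing it would be passed without the -c switch.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- queryres.rng (assumed file name): XML-syntax form of queryres.rnc -->
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="doc">
      <!-- the trailing "+" in the compact syntax -->
      <oneOrMore>
        <!-- the simple union: either kind of <p> may appear as a sibling -->
        <choice>
          <element name="p">
            <element name="c"><empty/></element>
            <element name="d"><empty/></element>
          </element>
          <element name="p">
            <element name="c"><empty/></element>
            <element name="e"><empty/></element>
          </element>
        </choice>
      </oneOrMore>
    </element>
  </start>
</grammar>
```

The choice between the two <p> patterns is the same mechanical union that "queryaut.wxs" attempts; the difference lies in the validation semantics, not in the shape of the expression.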






