Re: Are we losing out because of grammars?
"K.Kawaguchi" wrote:

> > The lesson I draw from this is that it's better to keep these things as
> > well separated as possible.
>
> I see.
>
> However, "type-assignment" is a quite similar task with validation. In
> fact, validator can easily report the type information if it wants to do
> so.

It's not in general easy, unless you restrict the grammar. For example, consider the following TREX pattern:

  <element name="x">
    <zeroOrMore>
      <element name="y">
        <attribute name="z">
          <data type="xsd:string"/>
        </attribute>
      </element>
    </zeroOrMore>
    <element name="y">
      <data type="xsd:integer"/>
    </element>
  </element>

If I'm in an "x" element and I get a "y" element with a "z" attribute that is a legal lexical representation of an integer, I can't tell whether to type that attribute as an "xsd:integer" or an "xsd:string" unless I look ahead and see whether it's the last "y" element in the "x". The TREX implementation works on a stream of SAX events, so this is a big complication.

> Or, in other words, if one wants to implement a "type-reporter", he/she
> is essentially implementing a validator.

It depends how you restrict the grammar. If you restrict the grammar as much as W3C's schemas, type assignment is significantly simpler than validation (since I believe I am correct in saying that for W3C schemas the type of an element depends only on its name and the names of its parents).

> In yet other words,
>
> > are separate functions and that mushing the two together is a bad idea:
> > I may want to validate without augmenting the infoset and I may want to
> > augment the infoset without validating.
>
> "Validation without type-assignment" is possible,

We agree on that.

> but "type-assignment
> without validation" is not possible.

As I indicated above, it depends.

> Therefore, in implementation level, validator can (and I think it 'should') incorporate
> type-reporter.

I would agree with 'can', but not with 'should'.
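
The lookahead problem with the TREX pattern above can be illustrated with a minimal sketch. It is not taken from the TREX implementation: the function name `assign_types` is hypothetical, and the model is simplified so that each "y" element is reduced to one lexical value arriving in document order. Under those assumptions, a streaming type assigner has to hold back each value until it knows whether another "y" follows.

```python
import re
from typing import Iterable, Iterator, Optional, Tuple

# Simplified lexical check for xsd:integer: optional sign, then digits.
# (Assumption: whitespace handling and other facets are ignored here.)
_INTEGER = re.compile(r"[+-]?[0-9]+")


def assign_types(values: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (value, type name) for the "y" values inside one "x".

    Hypothetical helper: every "y" except the last is typed xsd:string,
    and the last is typed xsd:integer. A value cannot be typed when it
    arrives, so one value of lookahead is buffered.
    """
    pending: Optional[str] = None
    for value in values:
        if pending is not None:
            # Another "y" followed, so the buffered one was not the last.
            yield pending, "xsd:string"
        pending = value
    if pending is None:
        raise ValueError('the pattern requires at least one "y" element')
    # End of "x" reached: the buffered "y" was the last one.
    if not _INTEGER.fullmatch(pending):
        raise ValueError(f"{pending!r} is not a legal integer")
    yield pending, "xsd:integer"
```

Note that the type of "1" in the input `["1", "2", "3"]` is reported only after "2" has been read, which is the delay that complicates a design driven purely by SAX events.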
There are many applications for which type-assignment is not necessary; I think dispatching on the "FQGI" (i.e. on the name of the element and the names of its ancestor elements) is sufficient for many applications. Type assignment may require quite different implementation techniques from validation.

> I asked this question because your implementation doesn't incorporate
> type-reporting capability.

Correct. It's just not something I've ever felt a great need for. I also think there's a huge potential for abuse (as Eric van der Vlist pointed out). I also feel very uneasy about the whole idea of reporting complex (in the W3C XML Schema sense) type names to applications: it feels a bit like exposing the names of XML parameter entities to the application, and I've never heard of anybody asking for that (unless they are writing a DTD editor). Exposing simple types makes a lot more sense to me: that's like asking for the type of an attribute.

Now it's my turn to ask you some questions.

- You seem to think type-assignment is very important. Why?

- Your ambiguity detection algorithm for RELAX detects whether it is possible to assign labels to elements in more than one way. I would find it more interesting to know whether it is possible to assign datatypes (as specified by the RELAX "type" attribute) to leaf elements and attributes in more than one way. Is it possible/easy to detect this kind of ambiguity?

James
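
Dispatching on the "FQGI", as mentioned above, can be sketched with Python's standard `xml.sax` module. This is only an illustration under stated assumptions: the class `PathDispatcher`, the function `collect`, and the "order"/"item"/"id" element names are all hypothetical, and text is gathered naively, so mixed content is not handled.

```python
import xml.sax
from typing import Callable, Dict, List, Tuple

Path = Tuple[str, ...]  # element name preceded by its ancestors' names


class PathDispatcher(xml.sax.ContentHandler):
    """Call a handler chosen by the full ancestor path of each element."""

    def __init__(self, handlers: Dict[Path, Callable[[str], None]]) -> None:
        super().__init__()
        self.handlers = handlers
        self.stack: List[str] = []
        self.text: List[str] = []

    def startElement(self, name, attrs):
        self.stack.append(name)
        self.text = []

    def characters(self, content):
        self.text.append(content)

    def endElement(self, name):
        # The same element name can mean different things in different
        # contexts, so look up the whole path, not just the name.
        handler = self.handlers.get(tuple(self.stack))
        if handler is not None:
            handler("".join(self.text))
        self.stack.pop()
        self.text = []


def collect(document: bytes) -> List[Tuple[str, str]]:
    """Parse a document and record which path-specific handler fired."""
    seen: List[Tuple[str, str]] = []
    handlers = {
        ("order", "id"): lambda t: seen.append(("order id", t)),
        ("order", "item", "id"): lambda t: seen.append(("item id", t)),
    }
    xml.sax.parseString(document, PathDispatcher(handlers))
    return seen
```

Here the two "id" elements are distinguished purely by their ancestors, with no schema and no type information involved.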