
Re: Are we losing out because of grammars?

  • From: James Clark <jjc@j...>
  • To: "K.Kawaguchi" <k-kawa@b...>
  • Date: Thu, 01 Feb 2001 15:50:44 +0700

"K.Kawaguchi" wrote:
 
> > The lesson I draw from this is that it's better to keep these things as
> > well separated as possible.
> 
> I see.
> 
> However, "type-assignment" is a quite similar task with validation. In
> fact, validator can easily report the type information if it wants to do
> so.

It's not in general easy, unless you restrict the grammar.  For example,
consider the following TREX pattern:

<element name="x">
  <zeroOrMore>
    <element name="y">
      <attribute name="z">
        <data type="xsd:string"/>
      </attribute>
    </element>
  </zeroOrMore>
  <element name="y">
    <data type="xsd:integer"/>
  </element>
</element>

If I'm in an "x" element and I get a "y" element with a "z" attribute
that is a legal lexical representation of an integer, I can't tell
whether to type that attribute as an "xsd:integer" or an "xsd:string"
unless I look ahead and see whether it's the last "y" element in
the "x".  The TREX implementation works on a stream of SAX events, so
this is a big complication.
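
[Editorial sketch, not part of the TREX implementation.] The lookahead
problem can be illustrated with a simplified model. It assumes that each
"y" carries one lexical value in the same position, that every "y" except
the last is typed "xsd:string", and that the last is typed "xsd:integer".
The function name assign_types and the one-item buffer are hypothetical.

```python
import re
from typing import Iterable, Iterator, Tuple

# Lexical form of xsd:integer: optional sign followed by decimal digits.
INTEGER = re.compile(r"[+-]?[0-9]+")


def assign_types(values: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (value, type) for the "y" values inside one "x" (simplified model).

    A value's type is undecided until the next event arrives, so exactly one
    value has to be held back at any time.
    """
    pending = None  # the single "y" value whose type is still unknown
    for value in values:
        if pending is not None:
            # Another "y" followed, so the held value was not the last one.
            yield pending, "xsd:string"
        pending = value
    if pending is None:
        raise ValueError('"x" must contain at least one "y"')
    # End of "x": the held value is the last "y" and must be an integer.
    if not INTEGER.fullmatch(pending.strip()):
        raise ValueError(f"last value {pending!r} is not an xsd:integer")
    yield pending, "xsd:integer"
```

With the input ["7", "8", "9"], the first two values come out as
"xsd:string" and the last as "xsd:integer", but "7" cannot be reported
until "8" has been seen. Here one item of buffering is enough; for less
restricted patterns the amount of lookahead is not bounded in this way.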

> Or, in other words, if one wants to implement a "type-reporter", he/she
> is essentially implementing a validator.

It depends how you restrict the grammar.  If you restrict the grammar as
much as W3C's schemas, type assignment is significantly simpler than
validation (since I believe I am correct in saying that for W3C schemas
the type of an element depends only on its name and the names of its
parents).
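
[Editorial sketch.] If the type really does depend only on the element's
name and the names of its ancestors, type assignment reduces to a table
lookup at each start tag, with no content model checked at all. The
element names, type names, and the TYPE_BY_PATH table below are invented
for illustration and do not come from any real schema.

```python
import xml.sax
from typing import Dict, List, Tuple

# Hypothetical table: path of element names from the root -> type name.
TYPE_BY_PATH: Dict[Tuple[str, ...], str] = {
    ("order",): "OrderType",
    ("order", "quantity"): "xsd:integer",
    ("order", "note"): "xsd:string",
}


class PathTyper(xml.sax.ContentHandler):
    """Assign a type to each element from its ancestor path alone."""

    def __init__(self) -> None:
        super().__init__()
        self.stack: List[str] = []                 # names of open elements
        self.assigned: List[Tuple[str, str]] = []  # (path, type) in document order

    def startElement(self, name, attrs):
        self.stack.append(name)
        path = tuple(self.stack)
        # The type is known at the start tag; no content is inspected.
        self.assigned.append(("/".join(path), TYPE_BY_PATH.get(path, "unknown")))

    def endElement(self, name):
        self.stack.pop()


def assign(document: str) -> List[Tuple[str, str]]:
    handler = PathTyper()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return handler.assigned
```

Calling assign("<order><quantity>abc</quantity></order>") still reports
"xsd:integer" for order/quantity even though "abc" is not a valid integer:
the types are assigned without the document being validated.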
 
> In yet other words,
> 
> > are separate functions and that mushing the two together is a bad idea:
> > I may want to validate without augmenting the infoset and I may want to
> > augment the infoset without validating.
> 
> "Validation without type-assignment" is possible,

We agree on that.

> but "type-assignment
> without validation" is not possible.

As I indicated above, it depends.
 
> Therefore, in implementation level, validator can (and I think it 'should') incorporate
> type-reporter.

I would agree with 'can', but not with 'should'. There are many
applications for which type-assignment is not necessary; I think
dispatching on the "FQGI" (i.e. on the name of the element and the names
of its ancestor elements) is sufficient for many applications.  Type
assignment may require quite different implementation techniques from
validation.

> I asked this question because your implementation doesn't incorporate
> type-reporting capability.

Correct.  It's just not something I've ever felt a great need for.  I
also think there's a huge potential for abuse (as Eric van der Vlist
pointed out). I also feel very uneasy about the whole idea of reporting
complex (in the W3C XML Schema sense) type names to applications: it
feels a bit like exposing the names of XML parameter entities to the
application, and I've never heard of anybody asking for that (unless they
are writing a DTD editor).  Exposing simple types makes a lot more sense
to me: that's like asking for the type of an attribute.

Now it's my turn to ask you some questions.

- You seem to think type-assignment is very important.  Why?

- Your ambiguity detection algorithm for RELAX detects whether it is
possible to assign labels to elements in more than one way. I would find
it more interesting to know whether it is possible to assign datatypes
(as specified by the RELAX "type" attribute) to leaf elements and
attributes in more than one way.  Is it possible/easy to detect this
kind of ambiguity?

James

