
Re: Another look at namespaces

  • From: "James Tauber" <jtauber@j...>
  • To: "XML-Dev Mailing list" <xml-dev@i...>
  • Date: Mon, 20 Sep 1999 21:18:10 +0800

----- Original Message -----
From: Simon St.Laurent <simonstl@s...>
> >You don't actually need the "vocabulary". The alphabet of a formal language
> >is part of the grammar.
>
> In XML-based languages that rely on DTDs or schemas, yes.  But in all
> formal languages?

Yes. The grammar includes the symbols it uses.

> Seems that it wouldn't be hard to create a formal
> language that had classes of vocabulary (like noun, verb, adjective) and
> fit them into patterns (subject[noun]-verb[verb]-object[noun]) that were
> separate.

This separation merely partitions the grammar into the productions that
take penultimate symbols to terminal symbols and all the other productions.

E.g.
    [1] Sentence -> NP VP
    [2] VP -> V NP
    [3] NP -> Simon
    [4] NP -> XML
    [5] V -> likes

What you are talking about is splitting productions 3-5 from 1-2. This is
often done in natural language processing and many theories of (natural)
language make a distinction between the lexicon and the syntactic rules. But
we are talking about formal languages, not natural languages.
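As a minimal sketch of that partition (Python is used here purely for illustration; the names SYNTAX, LEXICON, merge and generate are hypothetical and not part of the original message), productions 1-2 and 3-5 can be kept in separate tables and then merged. The merged result is still one grammar generating one language:

```python
from itertools import product

# Productions 1-2 from the example: the "syntactic rules".
SYNTAX = {
    "Sentence": [["NP", "VP"]],
    "VP": [["V", "NP"]],
}

# Productions 3-5 from the example: the "lexicon".
LEXICON = {
    "NP": [["Simon"], ["XML"]],
    "V": [["likes"]],
}


def merge(*parts):
    """Combine several production tables into a single grammar."""
    grammar = {}
    for part in parts:
        for lhs, alternatives in part.items():
            grammar.setdefault(lhs, []).extend(alternatives)
    return grammar


def generate(grammar, symbol):
    """List every terminal string derivable from symbol.

    Only suitable for grammars without recursion, such as this one.
    """
    if symbol not in grammar:
        # Anything without a production is treated as a terminal symbol.
        return [[symbol]]
    results = []
    for rhs in grammar[symbol]:
        expansions = [generate(grammar, s) for s in rhs]
        for combo in product(*expansions):
            results.append([tok for part in combo for tok in part])
    return results


GRAMMAR = merge(SYNTAX, LEXICON)
sentences = sorted(" ".join(s) for s in generate(GRAMMAR, "Sentence"))
```

Whether the five productions are stored in one table or two makes no difference to the set of sentences; the split is a matter of bookkeeping.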

[..]
> It's that, but it's also worse.  Suppose you have a nice modular DTD that
> expresses most of the vocabulary a user will need to create documents of a
> certain type, but has ANY sections so that users can organize it any way
> they like.  Users build sets of DTDs to see what exactly it is they're
> getting or producing, but all of the possibilities are actually open.  Is
> the language described by the 'master' DTD, which doesn't get you very far?
> Or is the language described by the particular DTDs?  Or do we measure
> interoperability?  A 'master DTD' containing all possibilities will quickly
> grow obese.

I'm not sure I understand what you are saying here. When a user pieces
together bits of different DTDs, they end up with a *single* DTD. This is a
single grammar defining a single set of valid instances.

> Then there's the simpler case of well-formed documents, for which we can
> _derive_ grammars, but can't make definitive statements above the level of
> XML 1.0 conformance.

Pardon? A grammar for well-formed documents doesn't need to be derived
because it is in the XML 1.0 REC. It is a BNF augmented by WFCs and the odd
bit of prose.
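For illustration only (a sketch, not something from the original message): a non-validating parser applies exactly that grammar and needs no DTD to do so. The helper name is_well_formed is hypothetical; the parser is the one in Python's standard library, which checks well-formedness but not validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # The parser rejected the input against the XML 1.0 grammar and WFCs.
        return False
    return True


good = is_well_formed("<doc><p>Simon likes XML</p></doc>")
bad = is_well_formed("<doc><p>Simon likes XML</doc>")  # mismatched end tag
```

No document-specific grammar had to be derived to tell the two inputs apart.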

[...]
> I think 'formal language' in that sense is not especially useful except for
> limited situations, and should probably be reserved for the few cases where
> XML development is limited to representations of older legacy systems that
> relied on formal languages based on that sense.  XML itself, it seems, can
> do better than that.

It can. But formal languages are part of the picture because sometimes there
are syntactic constraints. They might be loose, but they are still a
grammar.

> It depends on what kind of 'formalizing' you want to do.  In many cases,
> I'd suggest that we focus on 'relaxing', producing more flexible models
> that aren't so concerned about locking everything down into a single
> grammar and a single vocabulary.  It requires a change of mindset.

A formal grammar is still a formal grammar even if it permits any of the
terminal symbols in any order. A more flexible model is still a model. The
moment you model the syntax, you have a formal grammar.
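To make that concrete, here is a minimal sketch (in Python, with the hypothetical names loose_grammar and accepts; the three terminals are borrowed from the earlier example) of the loosest possible grammar over an alphabet: S -> t S for every terminal t, plus S -> empty. It accepts the terminals in any order and still rejects anything outside the alphabet.

```python
def loose_grammar(terminals):
    """Build productions S -> t S for each terminal t, plus S -> (empty)."""
    productions = [("S", (t, "S")) for t in sorted(terminals)]
    productions.append(("S", ()))
    return productions


def accepts(productions, tokens, symbol="S"):
    """Recognise tokens against productions of the two shapes built above."""
    for lhs, rhs in productions:
        if lhs != symbol:
            continue
        if rhs == ():
            # The empty production only matches when no input is left.
            if not tokens:
                return True
        elif tokens and tokens[0] == rhs[0]:
            # Consume one terminal, then continue from the next symbol.
            if accepts(productions, tokens[1:], rhs[1]):
                return True
    return False


LOOSE = loose_grammar({"Simon", "XML", "likes"})
any_order = accepts(LOOSE, ["likes", "XML", "Simon", "Simon"])
unknown_symbol = accepts(LOOSE, ["Simon", "likes", "SGML"])
```

However relaxed the model, it is still a set of productions over a fixed set of symbols.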

> Why is it that only one validating Java parser allows the application to
> continue after a validity constraint (not a well-formedness constraint) has
> been violated?

Because the others are wrong.

> I suspect it's because a lot of folks are taking the 'formal grammar' of
> DTDs more seriously than the XML 1.0 spec itself does...

But that has nothing to do with the value of formal grammars. If I present
you with a CFG modelling English and refuse to listen to you unless your
sentences parse to my CFG, that isn't a problem with my CFG *or* the notion
of CFGs in general.

> I don't think we're incompatibly far apart

I actually agree with you completely in pretty much everything but
terminology.

> I just would like folks to look at 'formal languages' a bit more closely
> and a bit more critically.  Rick Jelliffe's made excellent arguments in
> other postings on this thread, for example, regarding the ways formal
> languages can obscure as well as illuminate. Right now, I think we need to
> contemplate whether 'formal grammars' sufficiently distinguish 'languages'
> in practice before putting extra work for programmers and authors
> (namespaces) on every formal grammar that comes our way.

I think the XML community would generally agree that:

1. certain classes of formal grammar are not sufficient for the syntactic
constraints people wish to express
2. syntax isn't all there is

Linguists worked these out well before you and I were born, Simon :-)
I think SGMLers did too, which is one of the reasons that a Document Type
Definition in SGML includes semantics as well as syntax (see another post
where I follow on from Rick's comments relating to this).

As far as I can tell, no one is arguing that formal grammars are all we
need. I am merely trying to clarify what formal grammars are so that people
understand what is meant when someone says that a language has a grammar or
that a DTD is a grammar.

James




xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)


