Re: Another look at namespaces
----- Original Message -----
From: Simon St.Laurent <simonstl@s...>

> >You don't actually need the "vocabulary". The alphabet of a formal language
> >is part of the grammar.
>
> In XML-based languages that rely on DTDs or schemas, yes. But in all
> formal languages?

Yes. The grammar includes the symbols it uses.

> Seems that it wouldn't be hard to create a formal
> language that had classes of vocabulary (like noun, verb, adjective) and
> fit them into patterns (subject[noun]-verb[verb]-object[noun]) that were
> separate.

This separation is merely partitioning the grammar into the productions that take penultimate symbols to terminal symbols and all the other productions. E.g.:

[1] Sentence -> NP VP
[2] VP -> V NP
[3] NP -> Simon
[4] NP -> XML
[5] V -> likes

What you are talking about is splitting productions 3-5 from 1-2. This is often done in natural language processing, and many theories of (natural) language make a distinction between the lexicon and the syntactic rules. But we are talking about formal languages, not natural languages.

[..]

> It's that, but it's also worse. Suppose you have a nice modular DTD that
> expresses most of the vocabulary a user will need to create documents of a
> certain type, but has ANY sections so that users can organize it any way
> they like. Users build sets of DTDs to see what exactly it is they're
> getting or producing, but all of the possibilities are actually open. Is
> the language described by the 'master' DTD, which doesn't get you very far?
> Or is the language described by the particular DTDs? Or do we measure
> interoperability? A 'master DTD' containing all possibilities will quickly
> grow obese.

I'm not sure I understand what you are saying here. When a user pieces together bits of different DTDs, they end up with a *single* DTD. This is a single grammar defining a single set of valid instances.
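As a rough illustration of the partitioning described above, here is a minimal Python sketch that stores productions [1]-[2] and [3]-[5] separately and then merges them into one grammar. The names `SYNTAX`, `LEXICON`, `merge` and `generate` are invented for this example, and the enumerator assumes a non-recursive grammar such as this one.

```python
from itertools import product

# Productions 1-2: the "syntactic rules".
SYNTAX = {
    "Sentence": [["NP", "VP"]],
    "VP": [["V", "NP"]],
}

# Productions 3-5: the "lexicon" (penultimate symbol -> terminal symbol).
LEXICON = {
    "NP": [["Simon"], ["XML"]],
    "V": [["likes"]],
}

def merge(*parts):
    """Combine partial rule sets into one grammar (symbol -> right-hand sides)."""
    grammar = {}
    for part in parts:
        for lhs, alternatives in part.items():
            grammar.setdefault(lhs, []).extend(alternatives)
    return grammar

def generate(grammar, symbol):
    """Return every string derivable from symbol (grammar must be non-recursive)."""
    if symbol not in grammar:
        # No production rewrites this symbol, so it is a terminal of this grammar.
        return [[symbol]]
    results = []
    for rhs in grammar[symbol]:
        expansions = [generate(grammar, s) for s in rhs]
        for combo in product(*expansions):
            results.append([word for piece in combo for word in piece])
    return results

GRAMMAR = merge(SYNTAX, LEXICON)
SENTENCES = sorted(" ".join(words) for words in generate(GRAMMAR, "Sentence"))
```

However the productions are stored, `merge` yields one grammar with one set of sentences. Taken alone, `SYNTAX` is itself a grammar whose terminal symbols happen to be NP and V, which is the sense in which the alphabet is part of the grammar.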
> Then there's the simpler case of well-formed documents, for which we can
> _derive_ grammars, but can't make definitive statements above the level of
> XML 1.0 conformance.

Pardon? A grammar for well-formed documents doesn't need to be derived because it is in the XML 1.0 REC. It is a BNF augmented by WFCs and the odd bit of prose.

[...]

> I think 'formal language' in that sense is not especially useful except for
> limited situations, and should probably be reserved for the few cases where
> XML development is limited to representations of older legacy systems that
> relied on formal languages based on that sense. XML itself, it seems, can
> do better than that.

It can. But formal languages are part of the picture because sometimes there are syntactic constraints. They might be loose, but they are still a grammar.

> It depends on what kind of 'formalizing' you want to do. In many cases,
> I'd suggest that we focus on 'relaxing', producing more flexible models
> that aren't so concerned about locking everything down into a single
> grammar and a single vocabulary. It requires a change of mindset.

A formal grammar is still a formal grammar even if it permits any of the terminal symbols in any order. A more flexible model is still a model. The moment you model the syntax, you have a formal grammar.

> Why is it that only one validating Java parser allows the application to
> continue after a validity constraint (not a well-formedness constraint) has
> been violated?

Because the others are wrong.

> I suspect it's because a lot of folks are taking the 'formal grammar' of
> DTDs more seriously than the XML 1.0 spec itself does...

But that has nothing to do with the value of formal grammars. If I present you with a CFG modelling English and refuse to listen to you unless your sentences parse to my CFG, that isn't a problem with my CFG *or* the notion of CFGs in general.
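To make concrete the point above that a grammar permitting any terminal symbol in any order is still a grammar, here is a minimal Python sketch. The element names in `TERMINALS` and the helpers `loose_grammar` and `accepts` are hypothetical, chosen only for illustration.

```python
# Hypothetical element names; any finite alphabet would do.
TERMINALS = {"title", "para", "note"}

def loose_grammar(terminals):
    """Build S -> (empty) plus S -> t S for every terminal t."""
    productions = [("S", [])]
    for t in sorted(terminals):
        productions.append(("S", [t, "S"]))
    return productions

def accepts(productions, tokens):
    """Derive tokens left to right: apply S -> t S per token, then S -> (empty)."""
    rules = {tuple(rhs) for lhs, rhs in productions if lhs == "S"}
    for tok in tokens:
        if (tok, "S") not in rules:
            return False  # no production introduces this symbol
    return () in rules    # finish the derivation with S -> (empty)

LOOSE = loose_grammar(TERMINALS)
```

Under these assumptions, `accepts(LOOSE, ["note", "para", "para"])` succeeds for any ordering of the known symbols, while a sequence containing a symbol outside the alphabet is rejected. The model is as relaxed as it can be, yet it is still a finite set of productions over a fixed alphabet.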
> I don't think we're incompatibly far apart

I actually agree with you completely in pretty much everything but terminology.

> I just would like folks to look at 'formal languages' a bit more closely
> and a bit more critically. Rick Jelliffe's made excellent arguments in
> other postings on this thread, for example, regarding the ways formal
> languages can obscure as well as illuminate. Right now, I think we need to
> contemplate whether 'formal grammars' sufficiently distinguish 'languages'
> in practice before putting extra work
> for programmers and authors (namespaces) on every formal grammar that comes
> our way.

I think the XML community would generally agree that:

1. certain classes of formal grammar are not sufficient for the syntactic constraints people wish to express
2. syntax isn't all there is

Linguists worked these out well before you and I were born, Simon :-) I think SGMLers did too which is one of the reasons that a Document Type Definition in SGML includes semantics as well as syntax (see another post where I follow on from Rick's comments relating to this)

As far as I can tell, no one is arguing that formal grammars are all we need. I am merely trying to clarify what formal grammars are so that people understand what is meant when someone says that a language has a grammar or that a DTD is a grammar.

James

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)