Re: Grammars for XPath 2.0: which to use?

Subject: Re: Grammars for XPath 2.0: which to use?
From: "Dimitre Novatchev" <dnovatchev@xxxxxxxxx>
Date: Fri, 13 Jul 2007 06:40:20 -0700

Thank you so much.

I will update the grammar I am using.

> Although in the book I was definitely writing for users of the language
> rather than parser-writers, I didn't want to depart too far from the
> published grammar, so these compound symbols appear as <cast as>, which I
> think is actually quite a good compromise, though you need to read the
> accompanying text to see that you're actually allowed to have a comment
> in the middle of it.

This will complicate/change the lexical analysis -- hope just a little.

--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play

On 7/13/07, Michael Kay <mike@xxxxxxxxxxxx> wrote:

I don't think there were many significant grammar changes after the book was
printed. Only one or two minor ones like changing empty() to
empty-sequence(). There may also have been a few clarifications of lexical
rules, for example the fact that (10div 3) is illegal - there must be a
space between "10" and "div". (This question arose with (if($X)then 10else
20) where the "e" can be read as part of a numeric literal).
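As a rough illustration of the separator rule described above, the following Python sketch reads a numeric literal and rejects input where a name character follows it directly. The function name `leading_number` and the simplified name-start test are assumptions made for this example; they are not taken from the spec or from any existing parser.

```python
import re

# Numeric literal: integer, decimal, or double (optional exponent part).
NUMBER = re.compile(r"(?:\.\d+|\d+(?:\.\d*)?)(?:[eE][+-]?\d+)?")

def leading_number(text):
    """Return the numeric literal at the start of text.

    Raises ValueError if no literal is present, or if a name character
    follows the literal without intervening whitespace (as in "10div 3").
    """
    m = NUMBER.match(text)
    if m is None:
        raise ValueError("no numeric literal at start of input")
    rest = text[m.end():]
    # Simplified name-start test (assumption): a letter or underscore.
    if rest[:1].isalpha() or rest[:1] == "_":
        raise ValueError(
            "numeric literal %r must be separated from %r" % (m.group(0), rest)
        )
    return m.group(0)
```

With this sketch, "10 div 3" yields the literal "10", while "10div 3" and "10else 20" are rejected. In the second case the "e" cannot complete an exponent, so the literal ends at "10" and the name character that follows triggers the error.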

At the time I wrote the book, the draft spec was still using compound
symbols like <"cast" "as">. These subsequently came out, as a result of a
decision to present a spec that was more a description of the legal
sentences in the language and less a recipe for writing a parser. Although
in the book I was definitely writing for users of the language rather than
parser-writers, I didn't want to depart too far from the published grammar,
so these compound symbols appear as <cast as>, which I think is actually
quite a good compromise, though you need to read the accompanying text to
see that you're actually allowed to have a comment in the middle of it.
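To make the point about comments inside a compound symbol concrete, here is a minimal Python sketch that recognizes "cast" followed by "as" with whitespace or (possibly nested) XPath comments in between. The helper names `skip_ignorable` and `match_compound` are hypothetical, and treating a comment on its own as a sufficient separator is an assumption of the sketch.

```python
def skip_ignorable(text, pos):
    """Skip whitespace and (possibly nested) (: ... :) comments from pos."""
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
        elif text.startswith("(:", pos):
            depth = 1
            pos += 2
            while depth > 0:
                if pos >= len(text):
                    raise ValueError("unterminated comment")
                if text.startswith("(:", pos):
                    depth += 1
                    pos += 2
                elif text.startswith(":)", pos):
                    depth -= 1
                    pos += 2
                else:
                    pos += 1
        else:
            break
    return pos

def match_compound(text, pos=0, words=("cast", "as")):
    """Return the end position of the compound symbol at pos, or None."""
    for i, word in enumerate(words):
        if i > 0:
            nxt = skip_ignorable(text, pos)
            if nxt == pos:          # the words must be separated by something
                return None
            pos = nxt
        if not text.startswith(word, pos):
            return None
        pos += len(word)
        # The keyword must not run on into a longer name such as "asx".
        if pos < len(text) and (text[pos].isalnum() or text[pos] in "_-."):
            return None
    return pos
```

For example, both "cast as xs:integer" and "cast (: note :) as xs:integer" are recognized, whereas "castas" and "cast asx" are not.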

Some of the complexity in the spec, especially the Note you reference (which
was at one time part of the spec) arises from XQuery, which adds quite a few
complications to the already-complicated rules for XPath. I think it's true
that in XPath, unlike XQuery, you can tokenize without knowledge of the
grammatical context. The Saxon parser does a "raw" tokenization which for
XPath is essentially context-free, and then adds some processing between the
lexer and the syntax analyzer which essentially classifies tokens more
precisely based on the immediately preceding and following tokens - so
there's a separation between the two traditional tasks of a lexer, splitting
the text into tokens and classifying the tokens. But in other cases, for
example the distinction between "+" as an operator and "+" as an occurrence
indicator, it's left to the syntax analyzer to distinguish them.
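The split between raw tokenization and later classification can be sketched in Python as follows. This is not Saxon's code: the token kinds, the small symbol set, and the function names are invented for illustration, and the classification looks only at the preceding token, following the disambiguation rule of the XPath 1.0 lexical structure (a name or "*" that follows an operand is read as an operator).

```python
import re

# Stage 1: context-free splitting of the text into raw tokens.
RAW = re.compile(r"""
    \s*(?:
        (?P<number>\d+(?:\.\d*)?)
      | (?P<name>[A-Za-z_][\w-]*)
      | (?P<symbol>//|::|[/()\[\]@,*+\-$=])
    )""", re.VERBOSE)

OPERATOR_NAMES = {"div", "mod", "and", "or"}

def raw_tokens(text):
    """Split text into (kind, value) pairs without using any context."""
    text = text.rstrip()
    pos, out = 0, []
    while pos < len(text):
        m = RAW.match(text, pos)
        if m is None:
            raise ValueError("unexpected character at position %d" % pos)
        out.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return out

def classify(tokens):
    """Stage 2: refine each token's kind using the preceding token."""
    out = []
    for kind, value in tokens:
        prev = out[-1][0] if out else None
        after_operand = prev in {"number", "name", "wildcard", "close"}
        if kind == "name" and value in OPERATOR_NAMES and after_operand:
            kind = "operator"
        elif kind == "symbol" and value == "*":
            kind = "operator" if after_operand else "wildcard"
        elif kind == "symbol" and value in (")", "]"):
            kind = "close"
        out.append((kind, value))
    return out
```

Under these assumptions, the expression "div div div" is classified as name, operator, name, and the "*" in "3 * 4" becomes an operator while the "*" in "a/*" becomes a wildcard. Distinctions that need more than neighbouring tokens would still be left to the syntax analyzer.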

Michael Kay
http://www.saxonica.com/


> -----Original Message-----
> From: Dimitre Novatchev [mailto:dnovatchev@xxxxxxxxx]
> Sent: 13 July 2007 05:12
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject:  Grammars for XPath 2.0: which to use?
>
> Recently I've been having fun with parsing context-free
> languages using a general parser for LR languages, written in
> XSLT 2.0.
>
> The first and easier language was JSON, leading to the
> addition of two new functions to FXSL:
>       f:json-document()
> and
>      f:json-file-document()
>
> as reported in this list and in my blog.
>
> The second language I played with was XPath. As I mentioned
> earlier in this list, it was almost straightforward and
> non-problematic to create a working parser (right now
> constructing just a parse tree for an XPath expression). The
> reason for this easiness is that Dr. Kay's XPath 2.0 book is
> an excellent reference material both in describing the
> terminal symbols (lexical tokens) of the language and its grammar.
>
> My question is whether the XPath 2.0 grammar as described in
> the book is still equivalent to the one described in the
> XPath 2.0 recommendation (http://www.w3.org/TR/xpath20/#id-grammar)
>
> or if there are any differences?
>
> Certainly, I could try comparing both grammars myself, but
> why not ask and get this valuable information straight from
> the horse's mouth? I believe this is also valuable to the
> readers of xsl-list.
>
>
> As the official W3 XPath 2.0 recommendation is not so easy to
> read as Dr. Kay's book, I would prefer to be able to continue
> using the grammar from his book (possibly with appropriate
> modifications).
>
> The same question can be asked about the definition of the
> terminal symbols. Here we have:
>
>   1. Dr. Kay's book.
>
>   2. The official W3 XPath 2.0 recommendation
> (http://www.w3.org/TR/xpath20/#terminal-symbols)
>
>   3. A seemingly outdated W3 document "Building a Tokenizer
> for XPath or XQuery" (http://www.w3.org/TR/xquery-xpath-parsing/)
>
> In implementing the lexical scanner (again in pure XSLT 2.0)
> I again used Dr. Kay's book (1), found (2) quite confusing,
> and definitely decided not to use any of the approaches
> described in (3). It might be interesting to know that
> determining the next terminal symbol can be accomplished
> based on the evaluation of a single regular expression
> (shall I call this the "one-pass approach"?).
>
> --
> Cheers,
> Dimitre Novatchev
> ---------------------------------------
> Truly great madness cannot be achieved without significant
> intelligence.
> ---------------------------------------
> To invent, you need a good imagination and a pile of junk
> -------------------------------------
> You've achieved success in your field when you don't know
> whether what you're doing is work or play
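The "one-pass" idea mentioned in the original question, finding the next terminal symbol with a single regular expression, might look roughly like the Python sketch below. It is not the regular expression used in the XSLT 2.0 scanner, which is not shown in this thread; the token kinds and the reduced symbol set are assumptions. The main design point is the order of the alternatives: regex alternation takes the first alternative that matches, not the longest, so longer forms (double before decimal before integer, "//" before "/") have to come first.

```python
import re

# One regular expression with a named group per kind of terminal symbol.
NEXT_TOKEN = re.compile(r"""
    \s*(?:
        (?P<double>(?:\.\d+|\d+(?:\.\d*)?)[eE][+-]?\d+)
      | (?P<decimal>\.\d+|\d+\.\d*)
      | (?P<integer>\d+)
      | (?P<string>"(?:[^"]|"")*"|'(?:[^']|'')*')
      | (?P<name>[A-Za-z_][\w.-]*(?::[A-Za-z_][\w.-]*)?)
      | (?P<symbol>//|::|\.\.|!=|<=|>=|<<|>>|[/()\[\]@,*+\-$=<>|?.])
    )""", re.VERBOSE)

def next_token(text, pos=0):
    """Return (kind, value, end) for the token at pos, or None at end of input."""
    m = NEXT_TOKEN.match(text, pos)
    if m is None:
        if text[pos:].strip() == "":
            return None
        raise ValueError("unexpected character at position %d" % pos)
    # lastgroup names the single alternative that matched.
    return m.lastgroup, m.group(m.lastgroup), m.end()

def tokenize(text):
    """Apply next_token repeatedly and collect (kind, value) pairs."""
    pos, out = 0, []
    while True:
        tok = next_token(text, pos)
        if tok is None:
            return out
        out.append(tok[:2])
        pos = tok[2]
```

For instance, `tokenize("child::para[@id='x']")` yields the name `child`, the symbol `::`, the name `para`, and so on. This sketch does not handle comments and does not enforce the rule that adjacent names and numbers must be separated.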
