
Re: Pattern-Matching / Regular Expression Types

Subject: Re: Pattern-Matching / Regular Expression Types
From: Dimitre Novatchev <dnovatchev@xxxxxxxxx>
Date: Thu, 26 Apr 2012 15:39:01 -0700
There is a generic, table-driven LR(1) parser in FXSL; anyone is
welcome to use it.

Dimitre.

On Thu, Apr 26, 2012 at 3:28 PM, Michael Kay <mike@xxxxxxxxxxxx> wrote:
>
> You could try taking a look at Gunther Rademacher's REX parser generator.
> I've found it hard to find information about it, other than mentions by
> people who have used it for some rather interesting projects. Basically, if
> I understand it correctly, given an EBNF grammar, it generates a parser for
> that grammar written in XQuery. Most of the examples seem to be parsers for
> textual languages (i.e. where the tokens of the language being parsed are
> made up of characters) but I don't see any reason in principle why it
> shouldn't also parse a language where the tokens of the language are
> element nodes.
>
> Michael Kay
> Saxonica
>
>
> On 26/04/2012 22:45, Tiago Freitas wrote:
>>
>> I need to match patterns on a set of XML documents (all with the same
>> schema), and when a pattern matches, I need to retrieve the content
>> and do some specific transformations on that content (no xml output
>> needed).
>>
>> Specifically, they are natural language syntactic trees (and
>> dependencies).
>>
>> I will have a list of those "patterns", that are similar to regular
>> expressions, but with elements and attributes.
>>
>> pseudo-pattern example:
>>
>> (//ELEMENTx) (node())* (//ELEMENTy[@ATTRIBUTEz]) (node())*
>> (//@ATTRIBUTEw)
>>
>> I used XPath syntax inside the parentheses only. Other quantifiers
>> could be used, and dependencies between nodes/attributes could also
>> be specified, but that is another problem.
>>
>> This example would match when the xml has ELEMENTx as the first
>> element, ends with one element that has ATTRIBUTEw, and in between
>> needs to have an ELEMENTy with ATTRIBUTEz.
>>
>> Note that I need to match the whole document for each pattern, not
>> just part of it.
>>
>> The nesting of elements does not matter in this case (ELEMENTy could
>> be a child of ELEMENTx, or not), but they need to have that specific
>> order (in document order).
>>
>> Example of tree that can appear:
>>     TOP
>>     / \
>>    X   Y
>>    | \ | \
>>    1 2 3 4
>>
>> Matching patterns could be (node names, assuming no attributes):
>> X Y
>> 1 * Y
>> X 3 4
>> 1 * 4
>>
>> I could use XPath to get each individual node in the pattern, but then
>> I lose the order: if I do two XPath queries, I don't know the
>> positions of the results relative to each other.
>>
>> After matching, I will have rules for each pattern, that specify some
>> transformations on the content (change order, etc).
>>
>> Is there any way to do something like this using XSL, XQuery, or other
>> language? (preferably available in a Java implementation)
>>
>> Thanks for any pointers.
>> (Is it ok to cross-post this to an XQuery list? Recommend any?)
>
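
For what it's worth, the "whole document, in document order, nesting ignored" requirement above can be sketched without a parser generator: flatten the tree into its document-order sequence of element names and run an ordinary regular expression over that sequence. The sketch below is in Python purely for illustration; it is not how FXSL or REX work. The function names, the "*" wildcard token, and the element names n1..n4 (standing in for the leaves 1..4, which are not legal XML names) are all assumptions of mine, and attribute tests are left out.

```python
import re
import xml.etree.ElementTree as ET


def document_order_names(xml_text):
    """Return the element names of the document in document order."""
    root = ET.fromstring(xml_text)
    # Element.iter() walks the tree depth-first, parent before children,
    # which is document order for elements.
    return [elem.tag for elem in root.iter()]


def compile_pattern(tokens):
    """Build a regex over a string of space-terminated element names.

    Each token is an element name, or "*" (hypothetical syntax) meaning
    "any run of zero or more nodes".
    """
    parts = []
    for tok in tokens:
        if tok == "*":
            parts.append(r"(?:\S+ )*")
        else:
            parts.append(re.escape(tok) + " ")
    return re.compile("".join(parts))


def matches(xml_text, tokens):
    """True if the pattern matches the whole document, not just part of it."""
    names = document_order_names(xml_text)
    haystack = "".join(name + " " for name in names)
    return compile_pattern(tokens).fullmatch(haystack) is not None


doc = "<TOP><X><n1/><n2/></X><Y><n3/><n4/></Y></TOP>"
print(document_order_names(doc))
# ['TOP', 'X', 'n1', 'n2', 'Y', 'n3', 'n4']
print(matches(doc, ["*", "X", "*", "Y", "*"]))  # True
print(matches(doc, ["*", "Y", "*", "X", "*"]))  # False: wrong order
```

Because every name is terminated by a space and the match is anchored at both ends, a token can only match a complete element name. Retrieving the matched nodes (rather than a yes/no answer) would need capture groups mapped back to positions in the flattened list, which this sketch does not attempt.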



--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.
