Re: text files & xsd & regex


A few answers:

1) SGML. It lets you specify regular expressions
(content models) together with the delimiters used,
so a processor can read in text, parse it as SGML, and
output events. If you have many documents of this kind,
run them through an SGML processor.

2) Check out XPath 2.0. It looks like it will have some
syntax for this kind of thing; see
http://www.w3.org/TR/xquery-operators/#func-matches
and the reference to captured substrings. I guess the
most consistent thing would be to follow them in some way.

3) Probably ISO DSDL will have something like this.
In particular, which features in addition to Regular Fragmentations
are you interested in?

4) In the meantime, you can tokenize many kinds of strings
and check them against various constraints using Schematron,
which can be embedded in <appinfo> now and extracted
using a stylesheet. That does not give you full regular
expressions. (Schematron 1.6 will be out within a month,
with <let> statements that help you capture consecutive
substrings from a string, though that still falls short of
full regular expressions.) We have a free Windows tool that
supports embedded Schematron in XML Schemas.

5) If you are writing your own script, the embedded Schematron
XSLT scripts may be useful anyway: Francis Norton made some
really tricky code for extracting things from appinfo, and you 
may find it useful to hack that code to generate, for example,
Perl scripts.
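
If you do end up writing your own script, the rough shape might look
like the Python sketch below. It is only an illustration under stated
assumptions, not an existing tool: the rx namespace, the rx:pattern
element, and the "entries" wrapper element are all invented for the
example, and the pattern uses Python's regex syntax with named groups
rather than the XSD regex dialect (which has no named groups). It pulls
a pattern out of <appinfo>, applies it to each line of text, and emits
SAX-style events with one child element per named group.

```python
import io
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import XMLGenerator

XS = "{http://www.w3.org/2001/XMLSchema}"
RX = "{urn:example:line-regex}"  # hypothetical annotation namespace

# Toy schema: the regex lives in appinfo, groups are named after children.
SCHEMA = r"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                        xmlns:rx="urn:example:line-regex">
  <xs:element name="entry">
    <xs:annotation>
      <xs:appinfo>
        <rx:pattern><![CDATA[(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+) (?P<message>.*)]]></rx:pattern>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>
</xs:schema>"""


def load_patterns(schema_text):
    """Map each annotated element name to its compiled line regex."""
    root = ET.fromstring(schema_text)
    patterns = {}
    for element in root.iter(XS + "element"):
        node = element.find(f"{XS}annotation/{XS}appinfo/{RX}pattern")
        if node is not None and node.text:
            patterns[element.get("name")] = re.compile(node.text.strip())
    return patterns


def lines_to_events(lines, element_name, pattern, handler):
    """Emit one element per matching line, one child per named group."""
    handler.startDocument()
    handler.startElement("entries", {})  # hypothetical wrapper element
    for line in lines:
        match = pattern.fullmatch(line.rstrip("\n"))
        if match is None:
            continue  # a real tool would report non-matching lines
        handler.startElement(element_name, {})
        for child, text in match.groupdict().items():
            if text is None:
                continue  # optional group that did not take part
            handler.startElement(child, {})
            handler.characters(text)
            handler.endElement(child)
        handler.endElement(element_name)
    handler.endElement("entries")
    handler.endDocument()


patterns = load_patterns(SCHEMA)
sample = [
    "2003-03-14 INFO started\n",
    "not a log line\n",
    "2003-03-15 WARN disk <90% free\n",
]
out = io.StringIO()
lines_to_events(sample, "entry", patterns["entry"], XMLGenerator(out))
xml_text = out.getvalue()
print(xml_text)
```

XMLGenerator here just serializes the events so you can see them; any
SAX ContentHandler could be passed in its place. This sketch ignores
the content model entirely, so checking that the streamed result
matches the schema would still be a separate validation step.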

Cheers
Rick Jelliffe
http://www.topologi.com/

----- Original Message ----- 
From: "KRUMPOLEC Martin" <krumpolec@a...>
To: <xml-dev@l...>
Sent: Friday, March 14, 2003 3:25 AM
Subject:  text files & xsd & regex


> Hi,
> 
>   I would like to ask if anyone has seen something like this:
> 
>   - a W3C XSD schema annotated (appinfo) with regular expressions
>   - the regexes consist of groups named after child (text-only) elements
>   - a simple processor reads a text file line by line and produces SAX "events"
>   - this processor is driven by the content model of our schema
>   - the streamed "infoset" matches the schema
> 
>   it is similar to "Regular Fragmentations" by Simon St.Laurent,
>   just a little bit more complicated ...
> 
>   PS: if there is nothing like this I'll have to do it myself :-)
>   
> Thank you
> 
> Martin
> 
> -- 
> Martin Krumpolec <krumpo@p...>
> 
