Re: regular expressions
David Tolpin wrote:
>>> s-pattern="""
>>> comment = "\(([^\(\)\\]|\\.)*\)"
>>> atom = "[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+"
>>> atoms = atom "(\." atom ")*"
>>> [...]
>>>
>>>Why isn't it done?
>>
>>
>>HyLex used a similar syntax for regular expressions.
>>I've always wondered why the idea never caught on elsewhere.
>>(Then again, none of the ideas from HyTime ever really
>>caught on...)
>
>
> In fact, I've implemented it in an extension datatype library for my
> Relax NG validator; it is only 70 lines of code in Scheme, after all.
> Proved to be very useful for debugging.
Very clever. But a naive implementation would just recursively
concatenate the strings to make a single regex string. Could you
elaborate on the debugging advantage, i.e., how it makes it easier for a
schema writer to debug regular expressions?
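
To make the "naive implementation" concrete, here is a minimal Python sketch of recursive concatenation. It is not David's Scheme code, which is not shown in this thread. It assumes each definition is a sequence of double-quoted regex fragments and bare names referring to other definitions, as in the quoted s-pattern. The names DEFINITIONS, TOKEN and expand are hypothetical.

```python
import re

# Definitions copied from the quoted s-pattern: each right-hand side is a
# sequence of double-quoted regex fragments and bare references to other names.
DEFINITIONS = {
    "comment": r'''"\(([^\(\)\\]|\\.)*\)"''',
    "atom": r'''"[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+"''',
    "atoms": r'''atom "(\." atom ")*"''',
}

# A token is either a quoted fragment (group 1) or a bare name (group 2).
TOKEN = re.compile(r'"([^"]*)"|([A-Za-z][\w-]*)')


def expand(name, definitions, seen=()):
    """Recursively concatenate fragments into one plain regex string."""
    if name in seen:
        raise ValueError(f"recursive definition: {name}")
    parts = []
    for literal, ref in TOKEN.findall(definitions[name]):
        if ref:
            # Wrap each reference in a non-capturing group so that a
            # quantifier following it applies to the whole fragment.
            parts.append("(?:" + expand(ref, definitions, seen + (name,)) + ")")
        else:
            parts.append(literal)
    return "".join(parts)


atoms_regex = expand("atoms", DEFINITIONS)
print(atoms_regex)
print(bool(re.fullmatch(atoms_regex, "David.Tolpin")))
```

Because expand works on one name at a time, any named fragment can be expanded and tried against sample strings on its own.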
Jeni Tennison used the same idea with a slightly different syntax in her
DTLL proposal (I've lost the URL). Her idea had the added twist that an
application could receive the results of the regular expression parse as
a structured result, e.g., through a SAX API. Thus, using your example,
the string "(David Tolpen)David.Tolpin@n..." might produce the
'infoset':
<start>
<comment>(David Tolpen)</comment>
<local-part>
<atoms>
<atom>David</atom>.<atom>Tolpin</atom>
</atoms>
</local-part>@<domain>
<atoms>
<atom>nospam</atom>.<atom>net</atom>
</atoms>
</domain>
</start>
This still seems a fruitful avenue to explore.
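
As a rough illustration of what a structured result could look like, here is a small Python sketch that builds a tree shaped like the one above. It does not use DTLL syntax or a SAX API. It assumes a hand-written address pattern with named groups and simply splits each atoms run on ".". The names ADDRESS, atoms_element and parse, and the sample address, are made up for the example.

```python
import re
import xml.etree.ElementTree as ET

# Fragments taken from the quoted s-pattern; how they are combined into
# ADDRESS is an assumption made for this sketch.
ATOM = r"[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+"
ATOMS = rf"{ATOM}(?:\.{ATOM})*"
COMMENT = r"\((?:[^\(\)\\]|\\.)*\)"
ADDRESS = re.compile(
    rf"(?P<comment>{COMMENT})?(?P<local>{ATOMS})@(?P<domain>{ATOMS})"
)


def atoms_element(text):
    """Build <atoms> with one <atom> per dot-separated piece."""
    atoms = ET.Element("atoms")
    pieces = text.split(".")
    for index, piece in enumerate(pieces):
        atom = ET.SubElement(atoms, "atom")
        atom.text = piece
        if index < len(pieces) - 1:
            atom.tail = "."  # keep the separator as mixed content
    return atoms


def parse(address):
    """Return a <start> element for a matching address, else None."""
    match = ADDRESS.fullmatch(address)
    if match is None:
        return None
    start = ET.Element("start")
    if match.group("comment"):
        ET.SubElement(start, "comment").text = match.group("comment")
    local = ET.SubElement(start, "local-part")
    local.append(atoms_element(match.group("local")))
    local.tail = "@"
    domain = ET.SubElement(start, "domain")
    domain.append(atoms_element(match.group("domain")))
    return start


tree = parse("(a comment)first.last@example.org")
print(ET.tostring(tree, encoding="unicode"))
```

A real implementation would presumably derive the element names from the pattern definitions themselves; here they are hard-coded to keep the sketch short.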
Bob Foster
http://xmlbuddy.com/