
  • From: tpassin@h...
  • To: "Roger L. Costello" <costello@m...>, xml-dev@l...,xerces-j-dev@x...
  • Date: Sun, 03 Sep 2000 18:35:54 -0400

Roger L. Costello asked:

> Consider this regular expression:
>
> (.)+\.(gif|jpg|jpeg|bmp)
>
> As I interpret this regular expression it says, "one or more occurrences
> of any character, followed by a dot, followed by either gif or jpg or
> jpeg or bmp".  Correct?
>
> Here's my question - why is it that two schema validators (Oracle and
> xerces 1.2) both accept the following strings:
>
> images\mighty_oj.gif
> images\omega.jpg
> images\wheateena.jpg
>
> but reject these strings:
>
> images\champion.gif
> images\greenPower.jpg
> images\juiceman.jpg
>
I'm not sure whether it's the same in XML Schema, but try this version:

(.+)\.(gif|jpg|jpeg|bmp)

In Python, the first group of the original expression captures only the last
character of the name, because a repeated group keeps just its final
repetition; the second group returns the extension.  The new version captures
the whole name, at least in Python:

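import re

# Original version: the group is repeated, so it keeps only the last character
patre = r"(.)+\.(gif|jpg|jpeg|bmp)"
pat = re.compile(patre)

print(pat.findall(r'images\champion.gif'))
# prints [('n', 'gif')]

# New version: the repetition is inside the group, so it captures the whole name
patre = r"(.+)\.(gif|jpg|jpeg|bmp)"
pat = re.compile(patre)

print(pat.findall(r'images\champion.gif'))
# prints [('images\\champion', 'gif')]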
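
As a further check, here is a minimal sketch that runs all six file names from
the question through both patterns.  It assumes Python 3.4 or later for
fullmatch(), which requires the whole string to match, much as an XML Schema
pattern facet does.  Python's re module is not the XML Schema regex dialect,
so this only shows what Python does; the names "names", "check" and "results"
are made up for this illustration.

```python
import re

# The six file names from the question (three accepted, three rejected
# by the validators).
names = [
    r"images\mighty_oj.gif",
    r"images\omega.jpg",
    r"images\wheateena.jpg",
    r"images\champion.gif",
    r"images\greenPower.jpg",
    r"images\juiceman.jpg",
]

original = re.compile(r"(.)+\.(gif|jpg|jpeg|bmp)")
revised = re.compile(r"(.+)\.(gif|jpg|jpeg|bmp)")


def check(pattern, text):
    """Return the captured groups if the whole string matches, else None."""
    m = pattern.fullmatch(text)
    return m.groups() if m else None


# Map each name to (groups from the original pattern, groups from the revised one).
results = {name: (check(original, name), check(revised, name)) for name in names}

for name, (orig_groups, new_groups) in results.items():
    print(name, orig_groups, new_groups)
```

In Python, both patterns accept all six names; they differ only in what the
first group captures.  So this sketch does not reproduce the rejections
reported for the two validators.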

Cheers,

Tom Passin

