
Re: [OT] bugs in JDK regex engine ?

  • From: "Mukul Gandhi" <gandhi.mukul@g...>
  • To: xml-dev@l...
  • Date: Mon, 4 Feb 2008 09:27:58 +0530

Thanks Mike, for your comments.

Below is a simple example I tried with JDK 1.6.0.

String str = "<root><abc x='1'>text1</abc><pqr y='1'>text2</pqr></root>";

// anything from '<' to '>', and not having '/'
Pattern pattern = Pattern.compile("<[^/]+>");
Matcher matcher = pattern.matcher(str);

while (matcher.find()) {
   String group = matcher.group();
   System.out.println(group);
}

'str' is a String representation of an XML fragment.

I want to extract all pieces from the string (the tokens), which form
a start tag (including attribute parts).

I am expecting this output:
<root>
<abc x='1'>
<pqr y='1'>

But the output produced by the above program is:
<root><abc x='1'>
<pqr y='1'>

Notice that the 1st token is larger than expected: it spans two start tags.

Can you or anybody please help ...
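
A note on the behaviour above: the class [^/] excludes only '/', so it also matches '<' and '>'. Because '+' is greedy, the first match runs from '<root' up to the last '>' it can reach before a '/' appears, which gives "<root><abc x='1'>". The sketch below is one possible adjustment, not a general XML tokenizer: it also excludes '<' and '>' from the class. The pattern <[^/<>]+> and the names StartTagTokens and startTags are assumptions made for illustration and are not part of the original program. It would still miss start tags whose attribute values contain '/' or '>', and empty-element tags such as <a/>.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class StartTagTokens {
    // '<', then one or more characters that are none of '/', '<', '>', then '>'.
    // Excluding '<' and '>' keeps a match from running across a tag boundary.
    private static final Pattern START_TAG = Pattern.compile("<[^/<>]+>");

    // Collects every substring of xml that matches START_TAG, in order.
    static List<String> startTags(String xml) {
        List<String> tags = new ArrayList<String>();
        Matcher matcher = START_TAG.matcher(xml);
        while (matcher.find()) {
            tags.add(matcher.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        String str = "<root><abc x='1'>text1</abc><pqr y='1'>text2</pqr></root>";
        for (String tag : startTags(str)) {
            System.out.println(tag);
        }
        // Prints:
        // <root>
        // <abc x='1'>
        // <pqr y='1'>
    }
}
```

With the original pattern <[^/]+> the same loop prints the two lines reported above, so the difference lies in the character class and not in the JDK regex engine.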

On Feb 3, 2008 10:52 PM, Michael Kay <mike@s...> wrote:
> Saxon translates XML Schema and XPath regexes into JDK regexes, so it's
> pretty heavily dependent on the underlying regex engine. There are some
> cases where the behaviour is very incompletely specified, for example the
> effect of the "i" (case-blind) flag, but I've found very few cases where the
> expected behaviour is clear and the actual behaviour differs. In my
> experience, it's much more likely to be a user error.
>
> However, I think it might be stretching the (highly elastic) patience of
> this list to hold a discussion of JDK regex behaviour here.
>
> In any case, I think the whole concept of checking XML well-formedness using
> regular expressions is misguided, for the simple reason that (on theoretical
> grounds) regular expressions aren't up to the job.
>
> Michael Kay
> http://www.saxonica.com/


-- 
Regards,
Mukul Gandhi

