Subject: Different regex behaviour on Windows & Linux using Saxon
From: AAS Contractor <AAS.Contractor@xxxxxxx>
Date: Thu, 30 Aug 2007 14:41:02 +0100
(I have posted this to the Saxon help forum on sourceforge, but thought 
I'd also ask here in case it is not a Saxon-specific problem, but 
something more general that I'm missing.)

I have a strange problem here. I am using Saxon-B 8.9 (Java version) on both 
Linux and Windows. The code is being developed on a Windows PC but will 
eventually run in a production environment on a Linux box. However, I get 
different output from the same stylesheet depending on which machine I 
run it on. The input is something like
 
<kwd>stars: individual (RX J0052.9-7158, 2E0053.7-7227, SMC X-2)</kwd> 
 
The desired output for this would be 
 
<kwd>stars: individual<ind>RX J0052.9-7158</ind><ind>2E0053.7-7227</ind><ind>SMC X-2</ind></kwd>
 
And the relevant code I am using is 
 
<xsl:analyze-string select="." regex="\s*\(([^)]+)\)\s*">
  <xsl:matching-substring>
    <xsl:for-each select="tokenize(regex-group(1),'\s*,\s*')">
      <ind><xsl:value-of select="."/></ind>
    </xsl:for-each>
  </xsl:matching-substring>
  <xsl:non-matching-substring>
    <xsl:value-of select="."/>
  </xsl:non-matching-substring>
</xsl:analyze-string>
 
which does indeed produce the desired output on both platforms. However, 
if the string being matched contains an entity or character reference, it 
is still matched on the Windows machine but not on the Linux one, e.g.
 
<kwd>stars: individual (RX J0052.9&#8722;7158, 2E0053.7-7227, SMC X-2)</kwd>
 
and 
 
<kwd>stars: individual (RX J0052.9&minus;7158, 2E0053.7-7227, SMC X-2)</kwd>
 
 
produce output of  
 
<kwd>stars: individual<ind>RX J0052.9&#8722;7158</ind><ind>2E0053.7-7227</ind><ind>SMC X-2</ind></kwd>
 
on the Windows box, but on the Linux box they are not matched and are 
passed through as the non-matching substring, e.g.
 
<kwd>stars: individual (RX J0052.9&#8722;7158, 2E0053.7-7227, SMC X-2)</kwd>
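
One way to narrow this down might be to run the same pattern directly through java.util.regex on each machine, outside Saxon. The following is only a sketch under stated assumptions: the class name RegexCheck and the method rewrite are made up for illustration, and String.split is used as a rough stand-in for tokenize(). Saxon processes XPath regular expressions itself before they reach the JDK, so a match here would not prove that the stylesheet behaves the same way; it would only suggest whether the JVM's own regex engine treats U+2212 (the character behind &#8722; and &minus;) any differently from a plain hyphen.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Saxon: mirrors the analyze-string/tokenize
// logic from the stylesheet using only the JDK regex classes.
public class RegexCheck {
    // Same pattern as the regex attribute of xsl:analyze-string.
    private static final Pattern PAREN = Pattern.compile("\\s*\\(([^)]+)\\)\\s*");

    public static String rewrite(String text) {
        StringBuilder out = new StringBuilder();
        Matcher m = PAREN.matcher(text);
        int last = 0;
        while (m.find()) {
            // Text before the match plays the role of the non-matching substring.
            out.append(text, last, m.start());
            // Split the parenthesised group on commas, as tokenize() does.
            for (String item : m.group(1).split("\\s*,\\s*")) {
                out.append("<ind>").append(item).append("</ind>");
            }
            last = m.end();
        }
        // Any trailing text after the last match is copied unchanged.
        out.append(text.substring(last));
        return out.toString();
    }

    public static void main(String[] args) {
        // Plain hyphen, as in the case that works on both platforms.
        System.out.println(rewrite("stars: individual (RX J0052.9-7158, 2E0053.7-7227, SMC X-2)"));
        // U+2212 MINUS SIGN, as produced by &#8722; after XML parsing.
        System.out.println(rewrite("stars: individual (RX J0052.9\u22127158, 2E0053.7-7227, SMC X-2)"));
    }
}
```

If both lines come out with <ind> elements on the Linux box as well, the JDK regex engine is probably not the culprit, and the difference would more likely lie in how the input is read or in the Saxon/JDK combination in use.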
 
Has anyone got a clue as to why this is happening? 
 
cheers, 
 
Bruce  
