XQuery - RegEx Pattern Matcher
Michael Kay mike at saxonica.com
Fri Jun 19 12:50:07 PDT 2009
These things tend to be easier in XSLT, which has the more powerful
xsl:analyze-string instruction.

<xsl:analyze-string regex='"[^"]*?"|[^;]*'>
  <xsl:matching-substring>
    .. process one field ..

In XQuery I think I would start by doing a replace() to replace
semicolons-not-within-quotes by some other delimiter (e.g. a PUA
character), and then do a tokenize() to split the string on this new
delimiter.

Alternatively, splitting the string using a recursive function using
substring-before() and substring-after() might be just as easy.

Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay

> -----Original Message-----
> From: http://x-query.com/mailman/listinfo/talk [mailto:http://x-query.com/mailman/listinfo/talk]
> Sent: 19 June 2009 11:29
> To: Michael Kay; http://x-query.com/mailman/listinfo/talk
> Subject: Re: RE: XQuery - RegEx Pattern Matcher
>
> I am trying to "read" CSV data like this:
>
> one;"two;stilltwo";three;"four;stillfour";five
>
> This should resolve to something like this:
> ...
> <element>one</element>
> <element>two;stilltwo</element>
> <element>three</element>
> <element>four;stillfour</element>
> <element>five</element>
> ...
>
> If no separator (";") is allowed within a text, it is easy: just
> split the line on ";".
>
> But if a ";" can occur as text, then I have to use a RegEx.
> I succeeded in finding an XQuery RegEx for the case where a line
> contains only one ";" used as text.
>
> But I need to find every match, so I used the \\G. Worked
> fine, so I hoped to reuse it in XQuery...
>
> > -------- Original Message --------
> > Date: Fri, 19 Jun 2009 10:20:30 +0100
> > From: "Michael Kay" <http://x-query.com/mailman/listinfo/talk>
> > To: http://x-query.com/mailman/listinfo/talk, http://x-query.com/mailman/listinfo/talk
> > Subject: RE: XQuery - RegEx Pattern Matcher
> >
> > The XPath regular expression language does not recognize \G and it
> > does not recognize non-capturing groups.
> >
> > As far as matches() is concerned, there is no distinction between
> > capturing and non-capturing groups, so replace "(?:" by "(".
> >
> > I suspect you wanted your regex to contain "\G". In Java you need to
> > escape this as "\\G"; in XPath/XQuery, backslash is not a special
> > character and does not need to be escaped. However, there's no "\G" in
> > XPath regular expressions anyway. In Java it means "the end of the
> > previous match"; but XQuery is a functional language, so "previous" is
> > meaningless. At this stage I give up because I'm not sure what you are
> > trying to do: you haven't supplied enough of your code.
> >
> > Regards,
> >
> > Michael Kay
> > http://www.saxonica.com/
> > http://twitter.com/michaelhkay
> >
> > > -----Original Message-----
> > > From: http://x-query.com/mailman/listinfo/talk
> > > [mailto:http://x-query.com/mailman/listinfo/talk] On Behalf Of http://x-query.com/mailman/listinfo/talk
> > > Sent: 19 June 2009 09:33
> > > To: http://x-query.com/mailman/listinfo/talk
> > > Subject: XQuery - RegEx Pattern Matcher
> > >
> > > Hi,
> > > I am trying to use a RegEx within XQuery. In general that works fine.
> > > Now I have a more complex RegEx to work with CSV files (these CSV
> > > files have ";" as separator).
> > > I can use the following without problems in Java:
> > >
> > > Pattern Regex = Pattern.compile(
> > > "\\G(?:^|;)(?:\"((?:[^\"]|\"\")*)\"|([^\";]*))");
> > > ...
> > >
> > > But in XQuery
> > > let $regularExpr :='\\G(?:^|;)(?:\"((?:[^\"]|\"\")*)\"|([^\";]*))'
> > > ...
> > > if (matches($row,$regularExpr) ) then ( ...
> > >
> > > just gives the error:
> > >
> > > Error at character 4 in regular expression
> > > "\\G(?:^|;)(?:\"((?:[^\"]|\"\")...": expected ())
> > >
> > > I tried the optional flags (i, x, ...) but always with the same
> > > result...
> > > What is wrong with this RegEx?
> > >
> > > P.S.: I run the XQuery from Java with Saxon.
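
The replace()-then-tokenize() approach suggested in the reply at the top of this message could be sketched roughly as follows. This is only an illustration, not code from the thread: the variable names $line and $delim, the choice of U+E000 as the private-use delimiter, and the regular expression itself are assumptions. It also assumes that quoted fields contain no embedded or doubled quotes and that U+E000 never occurs in the data.

```xquery
(: Sample row taken from the question in this thread. :)
declare variable $line := 'one;"two;stilltwo";three;"four;stillfour";five';

(: Assumed stand-in delimiter: a private-use character (U+E000). :)
declare variable $delim := '&#xE000;';

(: Append a ';' so that every field, including the last one, is terminated.
   Each match is then one whole field (quoted or unquoted) plus its
   terminating ';', which is swapped for the stand-in delimiter.
   Semicolons inside quotes are consumed as part of the quoted field,
   so they are left untouched. :)
let $marked := replace(concat($line, ';'),
                       '("[^"]*"|[^;"]*);',
                       concat('$1', $delim))

(: The marked string ends with the delimiter, so tokenize() returns one
   extra empty item at the end; drop it. :)
for $field in tokenize($marked, $delim)[position() lt last()]
(: Strip the surrounding quotes from quoted fields. :)
return <element>{ replace($field, '^"(.*)"$', '$1') }</element>
```

For the sample row this should produce the five <element> items listed in the question. Neither pattern uses \G or (?: ... ), and neither can match a zero-length string, which replace() does not allow.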






