RE: How to select for ' in XPATH?

Subject: RE: How to select for ' in XPATH?
From: Hermann Stamm-Wilbrandt <STAMMW@xxxxxxxxxx>
Date: Wed, 5 Aug 2009 19:56:31 +0200
> I don't really know anything about the shell that you are using and any
> escaping or unescaping that it is doing, so it's a bit hard to tell.
I used this one:
http://www.xmlsh.org

> The general rule in XPath 2.0 is that if a string literal is enclosed
> in single quotes, an apostrophe should be represented as a pair of
> adjacent apostrophes.
I tried that hint as it was given by Martin, too.
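
For reference, the XPath 2.0 doubling rule quoted above can be sketched as a small JavaScript helper. This is only an illustration: the name toXPath2Literal is made up here, and the result is only meaningful to an XPath 2.0 processor (the exception text further down suggests the xpath command is backed by Saxon), not to an XPath 1.0 engine.

```javascript
// Hypothetical helper, not part of any library: wrap text in a
// single-quoted XPath 2.0 string literal, writing each apostrophe
// as a pair of adjacent apostrophes.
function toXPath2Literal(text) {
  return "'" + text.replace(/'/g, "''") + "'";
}

console.log(toXPath2Literal("'"));     // prints ''''
console.log(toXPath2Literal("it's"));  // prints 'it''s'
```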

In xmlsh this works:
$ xpath '/*/*/*[contains(normalize-space(.),"""")]' <tst.html
<p>apos and quot: ' " </p>
$ xpath '/*/*/*[contains(normalize-space(.),"''")]' <tst.html
<p>lt and gt: &lt; &gt; </p>
<p>apos and quot: ' " </p>
$

You are right, it is not clear what escaping/unescaping the shell does.
At least I do not see why the second xpath matches both <p> elements.


My real problem seems to be that I need an XPATH 1.0 solution, since
I want to do this in a browser environment, right?


The real problem is as follows:
- open an arbitrary web page in Firefox browser

- with a bookmarklet do an arbitrary selection in that page
  (http://en.wikipedia.org/wiki/Bookmarklet)

- then the bookmarklet generates e.g. the following xpath:
  "//*[contains(normalize-space(.),'xyz')]"
  where xyz is replaced by the actual selection data

- then Mozilla's document.evaluate() is used to determine the
  corresponding node in the DOM
  (https://developer.mozilla.org/en/Introduction_to_using_XPath_in_JavaScript)

This all works really fine as long as there is no &apos; character in
the selection ...

It is just this case where I need to figure out how to pass the apos
character to document.evaluate(). For simplicity let us assume that
the selection contains only the &apos; character.

The XPATH "//*[contains(normalize-space(.),''')]" is definitely wrong,
but what would be right?

Neither "//*[contains(normalize-space(.),'''')]" nor
"//*[contains(normalize-space(.),'\')]" works.


Interestingly "//*[contains(normalize-space(.),'%22')]"
matches for &quot;

Sadly "//*[contains(normalize-space(.),'%27')]"
does not match for &apos;

This is the JavaScript statement for the evaluation:
e = document.evaluate(unescape(s),document,null,
                      XPathResult.FIRST_ORDERED_NODE_TYPE, null);
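
A small sketch of what unescape() does to such strings may explain the difference between the two percent escapes. The variable names are illustrative only. Since unescape() runs before document.evaluate() parses anything, a percent-encoded double quote (%22) ends up as a valid single-quoted literal, while %27 simply recreates the unmatched apostrophe.

```javascript
// unescape() decodes the whole string first; the XPath parser only
// ever sees the decoded characters, so %27 cannot protect an apostrophe.
const withQuot = unescape("//*[contains(normalize-space(.),'%22')]");
const withApos = unescape("//*[contains(normalize-space(.),'%27')]");

console.log(withQuot); // prints //*[contains(normalize-space(.),'"')]
console.log(withApos); // prints //*[contains(normalize-space(.),''')]
```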

Any hint what can be done to make this work?
(I have no control over the webpage nor control over user selection)
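
One possible way out, sketched under the assumption that the expression is handed to document.evaluate() directly, without unescape(): XPath 1.0 string literals have no escape character, but a literal may be delimited by either quote character, and concat() can join pieces when the selection contains both. The helper names toXPathLiteral and buildSelectionXPath are hypothetical and not part of any API.

```javascript
// Hypothetical helper: turn arbitrary text into an XPath 1.0 string
// literal (or a concat() call when both quote characters occur).
function toXPathLiteral(text) {
  if (!text.includes("'")) {
    return "'" + text + "'";          // no apostrophe: single quotes are safe
  }
  if (!text.includes('"')) {
    return '"' + text + '"';          // apostrophe only: use double quotes
  }
  // Both quote characters occur: split at each apostrophe, quote the
  // pieces with apostrophes, and rejoin them with a double-quoted "'".
  const pieces = text.split("'").map(function (piece) {
    return "'" + piece + "'";
  });
  return "concat(" + pieces.join(", \"'\", ") + ")";
}

// Hypothetical helper: the expression from the bookmarklet description,
// with the selection text inserted as a safely quoted literal.
function buildSelectionXPath(selection) {
  return "//*[contains(normalize-space(.)," + toXPathLiteral(selection) + ")]";
}

console.log(buildSelectionXPath("'"));
// prints //*[contains(normalize-space(.),"'")]
console.log(toXPathLiteral("a'b\"c"));
// prints concat('a', "'", 'b"c')

// Browser-only part, skipped when no DOM is available.
if (typeof document !== "undefined" && typeof XPathResult !== "undefined") {
  const e = document.evaluate(buildSelectionXPath("'"), document, null,
                              XPathResult.FIRST_ORDERED_NODE_TYPE, null);
  console.log(e.singleNodeValue);
}
```

For a selection consisting of a single apostrophe this yields a double-quoted literal, so none of the ''' or '\' variants above are needed.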


Mit besten Gruessen / Best wishes,

Hermann Stamm-Wilbrandt
Developer, XML Compiler
WebSphere DataPower SOA Appliances
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martin Jetter
Management: Erich Baier
Registered office: Boeblingen
Registration court: Amtsgericht Stuttgart, HRB 243294


From: "Michael Kay" <mike@xxxxxxxxxxxm>
Date: 08/05/2009 07:20 PM
To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
Subject: RE:  How to select for &apos; in XPATH?
Please respond to: xsl-list@xxxxxxxxlberrytech.com

I don't really know anything about the shell that you are using and any
escaping or unescaping that it is doing, so it's a bit hard to tell. The
general rule in XPath 2.0 is that if a string literal is enclosed in single
quotes, an apostrophe should be represented as a pair of adjacent
apostrophes.

Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay



> -----Original Message-----
> From: Hermann Stamm-Wilbrandt [mailto:STAMMW@xxxxxxxxxx]
> Sent: 05 August 2009 18:04
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject:  How to select for &apos; in XPATH?
>
>
> Hello,
>
> I tried to select for special characters with XPATH below.
> While I succeeded for some I am unable to select for the
> &apos; character (') and got an error message.
>
> Any hint how this can be done?
>
> $ xmlsh
> $ cat tst.html
> <html><body>
> <p>lt and gt: &lt; &gt; </p>
> <p>apos and quot: &apos; &quot; </p>
> </body></html>
> $ tidy -q -xml tst.html;
> <html>
>   <body>
>     <p>lt and gt: &lt; &gt;</p>
>     <p>apos and quot: ' "</p>
>   </body>
> </html>
>
> $ xpath "/*/*/*[contains(normalize-space(.),'<')]" <tst.html
> <p>lt and gt: &lt; &gt; </p>
> $ xpath "/*/*/*[contains(normalize-space(.),'>')]" <tst.html
> <p>lt and gt: &lt; &gt; </p>
> $ xpath "/*/*/*[contains(normalize-space(.),'\"')]" <tst.html
> <p>apos and quot: ' " </p>
> $ xpath "/*/*/*[contains(normalize-space(.),'\'')]" <tst.html
> Exception running: xpath
> net.sf.saxon.s9api.SaxonApiException: XPath syntax error at char 34 in
> {...ontains(normalize-space(.),...}:
>     Unmatched quote in expression
> $
>
>
> Mit besten Gruessen / Best wishes,
>
> Hermann Stamm-Wilbrandt
> Developer, XML Compiler
> WebSphere DataPower SOA Appliances
> ----------------------------------------------------------------------
> IBM Deutschland Research & Development GmbH
> Chairman of the Supervisory Board: Martin Jetter
> Management: Erich Baier
> Registered office: Boeblingen
> Registration court: Amtsgericht Stuttgart, HRB 243294
