Subject: RE: How to select for ' in XPATH?
From: Hermann Stamm-Wilbrandt <STAMMW@xxxxxxxxxx>
Date: Wed, 5 Aug 2009 19:56:31 +0200
> I don't really know anything about the shell that you are using and any
> escaping or unescaping that it is doing, so it's a bit hard to tell.
I used this one:
http://www.xmlsh.org
> The general rule in XPath 2.0 is that if a string literal is enclosed
> in single quotes, an apostrophe should be represented as a pair of
> adjacent apostrophes.
I tried that hint as it was given by Martin, too.
In xmlsh this works:
$ xpath '/*/*/*[contains(normalize-space(.),"""")]' <tst.html
<p>apos and quot: ' " </p>
$ xpath '/*/*/*[contains(normalize-space(.),"''")]' <tst.html
<p>lt and gt: < > </p>
<p>apos and quot: ' " </p>
$
You are right, it is not clear what escaping/unescaping the shell does,
at least I do not see why the second xpath matches both <p>'s.
My real problem seems to be that I need an XPath 1.0 solution, since
I want to do this in a browser environment, right?
The real problem is as follows:
- open an arbitrary web page in Firefox browser
- with a bookmarklet do an arbitrary selection in that page
(http://en.wikipedia.org/wiki/Bookmarklet)
- then the bookmarklet generates e.g. the following xpath:
"//*[contains(normalize-space(.),'xyz')]"
where xyz is replaced by the actual selection data
- then Mozilla's document.evaluate() is used to determine the
corresponding node in the DOM
(https://developer.mozilla.org/en/Introduction_to_using_XPath_in_JavaScript)
This all works really fine as long as there is no ' character in
the selection ...
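To make the failure concrete, here is a minimal sketch of the construction described above. The function names buildNaiveXPath and findFirstNode are made up for illustration and are not taken from the actual bookmarklet; only document.evaluate(), XPathResult.FIRST_ORDERED_NODE_TYPE and singleNodeValue are real DOM names.

```javascript
// Hypothetical helper: wraps the selected text in a single-quoted XPath
// string literal, as in the expression shown in the list above.
function buildNaiveXPath(selection) {
  return "//*[contains(normalize-space(.),'" + selection + "')]";
}

// Hypothetical wrapper around document.evaluate(); `doc` is the page's
// document object. Only meaningful inside a browser.
function findFirstNode(doc, xpath) {
  const result = doc.evaluate(xpath, doc, null,
      XPathResult.FIRST_ORDERED_NODE_TYPE, null);
  return result.singleNodeValue;
}

// A selection without an apostrophe yields a well-formed expression.
console.log(buildNaiveXPath("xyz"));

// A selection containing an apostrophe ends the string literal early,
// so the generated expression has an unmatched quote.
console.log(buildNaiveXPath("it's"));
```

In a browser the lookup would then be something like findFirstNode(document, buildNaiveXPath(selectedText)).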
It is just this case where I need to figure out how to pass the apos
character to document.evaluate(). For simplicity let us assume that
the selection consists of the ' character only.
The XPATH "//*[contains(normalize-space(.),''')]" is definitely wrong,
but what would be right?
Neither "//*[contains(normalize-space(.),'''')]" nor
"//*[contains(normalize-space(.),'\')]" works.
Interestingly "//*[contains(normalize-space(.),'%20')]"
matches for "
Sadly "//*[contains(normalize-space(.),'%27')]"
does not match for '
This is the JavaScript statement for the evaluation:
e = document.evaluate(unescape(s),document,null,
XPathResult.FIRST_ORDERED_NODE_TYPE, null);
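Since the expression goes through unescape() before it is evaluated, the %27 attempt probably cannot help: the escape is decoded back to a plain apostrophe before the XPath parser sees it. A small sketch of that step follows. The name decodeExpression is made up for illustration, and decodeURIComponent() is used instead of unescape(), which gives the same result for these ASCII escapes.

```javascript
// Hypothetical stand-in for the unescape(s) call in the statement above.
// For simple ASCII escapes such as %27, decodeURIComponent() and
// unescape() return the same string.
function decodeExpression(s) {
  return decodeURIComponent(s);
}

const encoded = "//*[contains(normalize-space(.),'%27')]";

// The decoded string is what document.evaluate() actually receives:
// an apostrophe inside a single-quoted literal.
console.log(decodeExpression(encoded));
```

So the decoded expression is the same ''' variant that was already identified as wrong above.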
Any hint what can be done to make this work?
(I have no control over the webpage nor control over user selection)
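One approach that should work in XPath 1.0, which has no escape mechanism inside string literals, is to pick the quote character that does not occur in the selection and to fall back to concat() when both kinds occur. The sketch below assumes the selection arrives as a plain JavaScript string; toXPathLiteral and buildXPath are names made up for illustration, not part of any library.

```javascript
// Hypothetical helper: turns arbitrary text into an XPath 1.0 expression
// that evaluates to exactly that text.
function toXPathLiteral(text) {
  // No apostrophe: a single-quoted literal is safe.
  if (!text.includes("'")) {
    return "'" + text + "'";
  }
  // No double quote: a double-quoted literal is safe.
  if (!text.includes('"')) {
    return '"' + text + '"';
  }
  // Both kinds occur: split at each apostrophe, single-quote the pieces,
  // and join them with a double-quoted apostrophe inside concat().
  const pieces = text.split("'").map(function (piece) {
    return "'" + piece + "'";
  });
  return "concat(" + pieces.join(",\"'\",") + ")";
}

// Hypothetical builder for the expression used in this thread.
function buildXPath(selection) {
  return "//*[contains(normalize-space(.)," +
      toXPathLiteral(selection) + ")]";
}

console.log(buildXPath("xyz"));
console.log(buildXPath("'"));
console.log(buildXPath("apos and quot: ' \""));
```

The resulting string would have to be passed to document.evaluate() without any further quoting or unescaping step that could alter the quote characters.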
Mit besten Gruessen / Best wishes,
Hermann Stamm-Wilbrandt
Developer, XML Compiler
WebSphere DataPower SOA Appliances
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter
Geschaeftsfuehrung: Erich Baier
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294
From: "Michael Kay" <mike@xxxxxxxxxxxm>
To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
Date: 08/05/2009 07:20 PM
Subject: RE: How to select for ' in XPATH?
Please respond to: xsl-list@xxxxxxxxlberrytech.com
I don't really know anything about the shell that you are using and any
escaping or unescaping that it is doing, so it's a bit hard to tell. The
general rule in XPath 2.0 is that if a string literal is enclosed in single
quotes, an apostrophe should be represented as a pair of adjacent
apostrophes.
Regards,
Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay
> -----Original Message-----
> From: Hermann Stamm-Wilbrandt [mailto:STAMMW@xxxxxxxxxx]
> Sent: 05 August 2009 18:04
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: How to select for ' in XPATH?
>
>
> Hello,
>
> I tried to select for special characters with the XPATH expressions
> below. While I succeeded for some, I am unable to select for the
> apostrophe character (') and got an error message.
>
> Any hint how this can be done?
>
> $ xmlsh
> $ cat tst.html
> <html><body>
> <p>lt and gt: < > </p>
> <p>apos and quot: ' " </p>
> </body></html>
> $ tidy -q -xml tst.html;
> <html>
> <body>
> <p>lt and gt: < ></p>
> <p>apos and quot: ' "</p>
> </body>
> </html>
>
> $ xpath "/*/*/*[contains(normalize-space(.),'<')]" <tst.html
> <p>lt and gt: < > </p>
> $ xpath "/*/*/*[contains(normalize-space(.),'>')]" <tst.html
> <p>lt and gt: < > </p>
> $ xpath "/*/*/*[contains(normalize-space(.),'\"')]" <tst.html
> <p>apos and quot: ' " </p>
> $ xpath "/*/*/*[contains(normalize-space(.),'\'')]" <tst.html
> Exception running: xpath
> net.sf.saxon.s9api.SaxonApiException: XPath syntax error at char 34 in
> {...ontains(normalize-space(.),...}:
> Unmatched quote in expression
> $
>
>
> Mit besten Gruessen / Best wishes,
>
> Hermann Stamm-Wilbrandt
> Developer, XML Compiler
> WebSphere DataPower SOA Appliances
> ----------------------------------------------------------------------
> IBM Deutschland Research & Development GmbH
> Vorsitzender des Aufsichtsrats: Martin Jetter
> Geschaeftsfuehrung: Erich Baier
> Sitz der Gesellschaft: Boeblingen
> Registergericht: Amtsgericht Stuttgart, HRB 243294