
Re: Xpath Syntax Issue

Subject: Re: Xpath Syntax Issue
From: Nathan Tallman <ntallman@xxxxxxxxx>
Date: Sun, 24 Jun 2012 12:26:25 -0400
Sorry, here's my XSLT (remove.xsl):

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:s="http://www.sitemaps.org/schemas/sitemap/0.9"
    exclude-result-prefixes="s"
>

    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

    <xsl:strip-space elements="*"/>

    <!-- Standard copy -->
    <xsl:template match="*">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="s:urlset/s:url[normalize-space(s:loc) = 'URL']"/>

</xsl:stylesheet>

XML Snippet (sitemap1.xml):
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="
     http://www.sitemaps.org/schemas/sitemap/0.9
     http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">

     <url>
          <loc>URL</loc>
          <lastmod>2012-06-23T13:37:27+00:00</lastmod>
          <changefreq>monthly</changefreq>
          <priority>1.0</priority>
     </url>
     ....
</urlset>

Command used in Linux:
xsltproc -o sitemapb.xml remove.xsl sitemap1.xml

(In case anyone is wondering why I want to remove URLs from a sitemap,
there are a few pages generated by a script, purely for crawling
reasons, as the pages don't crawl well otherwise. The sitemap feeds
the indexing engine for our website and I don't want these artificial
pages cluttering up search results. So after the sitemap is generated,
I want to run this XSLT to remove the URLs before the indexer starts.)

Thanks,
Nathan


On Sun, Jun 24, 2012 at 11:31 AM, Michael Kay <mike@xxxxxxxxxxxx> wrote:
>
>
> On 24/06/2012 15:35, Nathan Tallman wrote:
>>
>> Is there any reason why this transformation works in Oxygen, using
>> Saxon and xsltproc, yet doesn't work from the Linux command line using
>> xsltproc? When running from the command line, all the attributes from
>> urlset are removed, but the unwanted URLs remain.
>
>
> I for one haven't followed this thread in detail, so I'm not sure what
> "this transformation" refers to.
>
> Michael Kay
> Saxonica
>
>>
>> On Sat, Jun 23, 2012 at 10:56 PM, Nathan Tallman <ntallman@xxxxxxxxx>
>>  wrote:
>>>
>>> Thanks Chris. I had just found this explanation on
>>>
>>>
>>> <http://stackoverflow.com/questions/3836121/xslt-does-not-work-when-i-include-xmlns-http-www-sitemaps-org-schemas-sitemap>
>>> when your email came in. This takes care of it.
>>>
>>> Much appreciation.
>>> Nathan
>>>
>>> On Sat, Jun 23, 2012 at 10:51 PM, Christopher R. Maden <crism@xxxxxxxxx>
>>>  wrote:
>>>>
>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>> Hash: SHA1
>>>>
>>>> On 06/23/2012 10:38 PM, Nathan Tallman wrote:
>>>>>
>>>>> I still wasn't getting the results in my application, so I created
>>>>> pets.xml and sure enough the template worked. It only works with
>>>>> my original document if I remove attributes found in the root
>>>>> element.
>>>>>
>>>>> The original first 6 lines: <?xml version="1.0" encoding="UTF-8"?>
>>>>> <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
>>>>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>> xsi:schemaLocation=" http://www.sitemaps.org/schemas/sitemap/0.9
>>>>> http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
>>>>>
>>>>> I had to remove all attributes from <urlset> before the XSL would
>>>>> work. Do I need to reference the schema in my XSL?
>>>>
>>>> Ahh... the good ol namespace FAQ.
>>>>
>>>> Every element type name is a pair: namespace URI and local name.
>>>>
>>>> What you thought was null-namespace plus species is in fact
>>>> http://www.sitemaps.org/schemas/sitemap/0.9 plus species (often
>>>> written as {http://www.sitemaps.org/schemas/sitemap/0.9}species).  An
>>>> XPath expression matching just species matches {}species, which is a
>>>> *different name* than
>>>> {http://www.sitemaps.org/schemas/sitemap/0.9}species.
>>>>
>>>> You need, in your XSLT, to declare something like
>>>> xmlns:sitemap="http://www.sitemaps.org/schemas/sitemap/0.9" and then
>>>> use sitemap:species in your XPath.  (A shorter prefix might be in
>>>> order, but a prefix is required for XSLT 1.0 and recommended (IMO) for
>>>> clarity for XSLT 2.0.)
>>>>
>>>> ~Chris
>>>> - --
>>>> Chris Maden, text nerd<URL: http://crism.maden.org/>
>>>> LIVE FREE: vote for Gary Johnson, Libertarian for President.
>>>>     <URL: http://garyjohnson2012.com/>    <URL: http://lp.org/>
>>>> GnuPG fingerprint: DB08 CF6C 2583 7F55 3BE9  A210 4A51 DBAC 5C5C 3D5E
>>>>
>>>>
>>>> -----BEGIN PGP SIGNATURE-----
>>>> Version: GnuPG v1.4.10 (GNU/Linux)
>>>> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>>>>
>>>> iQEcBAEBAgAGBQJP5oCxAAoJEEpR26xcXD1eHSUH/0E0F49MPJJJ1j/1lB9Zw0zK
>>>> gNBxalYi/zVpHCgSYNzdXYrdvYWZFIDkQng4opPXBLA5nbWvaJ4qpObrMbB80cmN
>>>> unUmPhrb5IkuYx1adgCvNzxlRuabdG06jUUbO11kq8HPbyWH74tEsFP5+IPrTOpn
>>>> /xmZTkR5Z0kO93yl6osUbyeq42dF34HmyQKVwWQD0dXHVM8q5BUbVesnxmjdGoE9
>>>> 7zZTJH+r3K0WhGbM0Iq91wZ4LF3qTT25gih+TBF3cMAzsBCGaxzzFlRoJj0qDVj2
>>>> q6DW/awQW+JU8VxRavaoQG1rk1No/k/GkStSv+UXCBdl3qwdwbVIXWdXaliZ0/o=
>>>> =YGiD
>>>> -----END PGP SIGNATURE-----
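
The namespace point Chris makes above (every element name is a pair of namespace URI and local name) can be checked outside XSLT. What follows is only a rough sketch using Python's standard xml.etree.ElementTree, not the xsltproc setup discussed in the thread. The sample document is a cut-down version of the sitemap snippet, and the second <url> entry (http://example.com/keep) and the helper name remove_urls are made up for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Cut-down sitemap; the second <url> is a hypothetical entry added so
# that something is left over after the unwanted one is removed.
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc> URL </loc><priority>1.0</priority></url>
  <url><loc>http://example.com/keep</loc></url>
</urlset>"""


def remove_urls(xml_text, unwanted_loc):
    """Drop every url element whose whitespace-normalized loc equals unwanted_loc."""
    root = ET.fromstring(xml_text)
    ns = {"s": SITEMAP_NS}  # the prefix is arbitrary; only the URI matters
    # findall returns a list, so removing children while looping is safe.
    for url in root.findall("s:url", ns):
        loc = url.findtext("s:loc", default="", namespaces=ns)
        # Roughly what normalize-space() does: trim and collapse whitespace.
        if " ".join(loc.split()) == unwanted_loc:
            root.remove(url)
    return root


root = ET.fromstring(SITEMAP)
# The parsed name carries the namespace URI, written {uri}local.
print(root.tag)
# An unprefixed path looks in no namespace and finds nothing.
print(len(root.findall("url")))
# The prefixed path, bound to the sitemap URI, finds both entries.
print(len(root.findall("s:url", {"s": SITEMAP_NS})))

cleaned = remove_urls(SITEMAP, "URL")
print([loc.text for loc in cleaned.iter("{%s}loc" % SITEMAP_NS)])
```

The unprefixed lookup behaves like an XSLT 1.0 match on url with no prefix: it asks for a name in no namespace, which is a different name from the one in the document.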
