
Subject: Re: Including URL-encoded query string in XHTML document
From: Jeni Tennison <mail@xxxxxxxxxxxxxxxx>
Date: Thu, 11 Jan 2001 18:07:52 +0000
Hi Yelena,

> I'm trying to process an XML data feed that contains URL-encoded query
> strings, like the following:
>
>         <item url="research.exe?ticker=GS&type=1" date="01/01/2000">goldman
> sachs</item>    

That isn't well-formed XML, and the XML parser that you're using should
complain when it sees it. In XML, it's illegal to have a '&' character
that doesn't mark the start of an entity or character reference. The
XML you need to use is:

<item url="research.exe?ticker=GS&amp;type=1"
      date="01/01/2000">goldman sachs</item>
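
If you want to check this for yourself, here is a minimal sketch using Python's standard xml.etree.ElementTree module (my choice for illustration, not necessarily the parser you're using). The helper name is_well_formed is made up for this example. Any conforming XML parser should behave the same way: reject the bare '&' and accept the '&amp;' version.

```python
import xml.etree.ElementTree as ET

# The item as it appears in the original feed (bare '&' in the attribute).
bad = '<item url="research.exe?ticker=GS&type=1" date="01/01/2000">goldman sachs</item>'

# The same item with the ampersand escaped.
good = '<item url="research.exe?ticker=GS&amp;type=1" date="01/01/2000">goldman sachs</item>'


def is_well_formed(text):
    """Return True if the parser accepts the text, False if it reports an error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(bad))   # False
print(is_well_formed(good))  # True
```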

The XML that you see in a file is just a *serialisation* of a node
tree. In the node tree, entity references are replaced by whatever
they reference. So the node tree for the above looks like:

+- (element) item
   | +- (attribute) url = research.exe?ticker=GS&type=1
   | +- (attribute) date = 01/01/2000
   +- (text) goldman sachs

Note the url attribute has a value with the character '&' in it rather
than the entity reference.
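
As a rough illustration of the difference between the serialisation and the node tree, this sketch parses the well-formed item with Python's xml.etree.ElementTree (assumed here only as a convenient stand-in for whatever parser sits in front of your XSLT processor) and looks at the values held in the tree.

```python
import xml.etree.ElementTree as ET

# The serialised form uses the entity reference '&amp;'.
source = '<item url="research.exe?ticker=GS&amp;type=1" date="01/01/2000">goldman sachs</item>'

item = ET.fromstring(source)

# In the parsed tree the attribute value holds the '&' character itself.
print(item.get("url"))   # research.exe?ticker=GS&type=1
print(item.get("date"))  # 01/01/2000
print(item.text)         # goldman sachs
```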

> Any advice on what is the best way to pass a URL-encoded string through the
> XSLT transformation?
> I substituted "&" with "&amp;" in the original data, but then the output
> XSLT document also contains &amp; and there seems to be no way to print "&"
> as it is.
> Using <xsl:output method="html" > or "disable-output-escape" directives did
> not seem to help. 

When you create some output with XSLT, if it's creating XML it sticks
to XML rules.  So because XML doesn't allow a '&' that isn't the start
of an entity reference, the XSLT processor outputs '&amp;' instead.

When you tell it to output in HTML with <xsl:output method="html" />,
it still sticks with this rule because you can have entity references
in HTML as well, and you need to know when an '&' is an ampersand
character and when it's the start of an entity reference.  Almost
always, an '&' in an HTML node tree will be serialised as '&amp;' when
it's written to a file.
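
The same thing can be seen outside XSLT. The sketch below is not an XSLT processor; it uses the serialiser in Python's xml.etree.ElementTree as an analogy, on the assumption that it follows the same escaping rule for attribute values. It builds a link whose href holds a literal '&' and writes it out with both the XML and the HTML output methods.

```python
import xml.etree.ElementTree as ET

# Build the node directly, so the attribute value holds a literal '&'.
link = ET.Element("a", {"href": "research.exe?ticker=GS&type=1"})
link.text = "goldman sachs"

# Serialise the same node with the XML and the HTML output methods.
as_xml = ET.tostring(link, encoding="unicode", method="xml")
as_html = ET.tostring(link, encoding="unicode", method="html")

# Both serialisations write the '&' in the attribute as '&amp;'.
print(as_xml)
print(as_html)
```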

But this shouldn't be a problem. Whatever program looks at the HTML
and reads it should interpret the '&amp;' correctly and Do The Right
Thing. You shouldn't have to worry about it. Obviously it is causing
you a problem though - is it really the case that if you create an
HTML document with the following links in it:

<p>
  <a href="research.exe?ticker=GS&amp;type=1">goldman sachs
  (entity)</a>;
  <a href="research.exe?ticker=GS&type=1">goldman sachs
  (character)</a>;
</p>

that the second works and the first doesn't?  If so, you've got a
dodgy browser.
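
To see what a reasonably behaved HTML reader does with the two links, here is a small sketch using Python's standard html.parser module as a stand-in for a browser (an assumption; real browsers have their own parsers). The class name HrefCollector is invented for this example. It collects the href values as the parser reports them.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> start tag, as decoded by the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(value for name, value in attrs if name == "href")


page = """
<p>
  <a href="research.exe?ticker=GS&amp;type=1">goldman sachs (entity)</a>;
  <a href="research.exe?ticker=GS&type=1">goldman sachs (character)</a>;
</p>
"""

collector = HrefCollector()
collector.feed(page)
collector.close()

# Each href is printed as the program following the link would see it.
for href in collector.hrefs:
    print(href)
```

With this parser both links come out as the same URL, so the '&amp;' form costs you nothing. The bare '&' form only works because 'type' doesn't happen to look like an entity name; a parameter called something like 'copy' could be misread.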

> and use the stylesheet below to construct an href tag for each item
> element:
>
>         <xsl:template match="item">
>                 <a>
>                         <xsl:attribute name="href">
>                                 <xsl:value-of select="@url" />
>                         </xsl:attribute>
>                         <xsl:value-of select="." />
>                 </a>
>         </xsl:template>

It's not directly relevant, but this is equivalent to:

<xsl:template match="item">
   <a href="{@url}"><xsl:value-of select="." /></a>
</xsl:template>

I hope that helps,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

