
Re: Sharing Techniques: White Spaces in HTML pages by XSLT


response.write spaces
> >  Jonathan:
> J> Is your XSLT output supposed to be HTML ? What charset are you
> J> using ? Do you really need special (non-breaking) spaces, or will
> J> regular whitespace do ?
>
> Output was HTML. The charset would have been whatever MSXML3 put on it
> automatically (probably UTF-16). I just needed a series of spaces that
> will not be collapsed. Is that not what nbsp is for?

You could try being explicit about the output charset :

<xsl:output method="html" encoding="iso-8859-1" />

This will only be honored if your stylesheet is invoked either via
<?xml-stylesheet?> or via the DOMDocument.transformNodeToObject method.
The DOMDocument.transformNode method returns a UTF-16 BSTR and ignores
the "encoding" attribute.

If you're transforming from ASP, getting the right encoding out is a bit
tricky. The naive way :
    Response.Write doc.transformNode(stylesheet)
will make the XSL processor produce UTF-16, which Response.Write then
converts to iso-8859-1 (actually, to whatever codepage was specified
through Response.CodePage) before sending it to the client.
This gets you in trouble because MSXML, thanks to the method="html"
attribute, will still have inserted a META tag which describes the document
as being in iso-8859-1, and ASP won't touch it. The browser may get
quite confused (IE does).
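
To see why a wrong charset label hurts, here is a small Python sketch. It is only an analogy (no MSXML or ASP involved, and the sample string and variable names are mine): bytes written in one charset but labelled as another decode into garbage, which is roughly the position the browser is in when the META tag and the real encoding disagree.

```python
# Analogy only: no MSXML or ASP here. The sample text is hypothetical.
text = "caf\u00e9\u00a0au lait"      # contains e-acute and a non-breaking space

sent_bytes = text.encode("utf-8")    # what actually goes over the wire
declared = "iso-8859-1"              # what the META tag claims

# A client that trusts the declaration decodes with the wrong charset.
seen_by_client = sent_bytes.decode(declared)

print(repr(text))
print(repr(seen_by_client))
```

Each non-ASCII character turns into two unrelated characters on the client side, and the non-breaking space picks up a stray accented letter in front of it.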

The correct way :
    doc.transformNodeToObject stylesheet, Response
will plug the XSL processor directly into the HTTP response stream.
Just make sure you do one of the following :
 - set your stylesheet's output encoding to iso-8859-1 (ASP's default
   charset), as shown above
 - use Response.Charset = "utf-8" to inform the client that your stylesheet
   outputs UTF-8 (MSXML's default charset for XSL output to a stream)
 - set both to any other charset; as long as they're in sync you shouldn't
   have any problems.

When Response.Write encounters Unicode characters that cannot be
represented in ASP's output charset, they are replaced by another character
within the charset. This means information is lost.

When you tell the XSL processor to output iso-8859-1, characters
outside the charset will be replaced by character references (e.g. &#160;)
so they will get to the client intact.
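
The difference between the two behaviours can be sketched in Python, whose codecs offer both strategies as error handlers. This is an illustration of the idea only, not of what ASP or MSXML do internally; the sample text is made up, and Python's lossy handler happens to substitute "?".

```python
# U+2014 (em dash) is not in iso-8859-1; U+00A0 (non-breaking space) is.
text = "dash \u2014 and nbsp\u00a0here"

# Lossy: characters outside the charset become "?", so information is lost.
lossy = text.encode("iso-8859-1", errors="replace")

# Lossless: characters outside the charset become numeric character references.
safe = text.encode("iso-8859-1", errors="xmlcharrefreplace")

print(lossy)
print(safe)
```

In both cases the non-breaking space goes through as the single byte 0xA0, because it exists in the target charset; only the dash needs a substitute.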

This may be what is happening to you with the nbsp characters, although
iso-8859-1 does have a non-breaking space character at the same codepoint
(160), as does windows-1252, and ASP correctly maps U+00A0 to that character.
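
The codepoint claim is easy to verify with any language that ships charset tables. A small Python check (Python is my choice here, not something from the thread):

```python
nbsp = "\u00a0"  # NO-BREAK SPACE

# Encode the same character in both charsets and show the resulting byte value.
for charset in ("iso-8859-1", "windows-1252"):
    encoded = nbsp.encode(charset)
    print(charset, encoded, encoded[0])
```

Both charsets encode it as one byte with the value 160.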

The other possibility is that your client is broken and treats the U+00A0
character as regular whitespace, and only treats "&nbsp;" as a non-breaking
space. This is horrid but I imagine some implementations might have taken
such a shortcut. It's easy to check: write an HTML file that has "&#160;"
where you would put "&nbsp;", and if these spaces are ignored, the browser
is broken.
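
If you want to convince yourself that the two references name the same character before blaming the browser, a short Python sketch can do it and also build the test page. The page content and variable names are hypothetical.

```python
import html

# Both references should decode to the same character, U+00A0.
named = html.unescape("&nbsp;")
numeric = html.unescape("&#160;")
print(named == numeric, hex(ord(named)))

# Hypothetical test page: five numeric references between two brackets.
# Save the output to a file and open it in the browser under test.
page = "<html><body><p>[" + "&#160;" * 5 + "]</p></body></html>"
print(page)
```

A conforming browser should show a visible gap of five spaces between the brackets.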

If the problem is on the client side, you'll have to resort to hacks such as
disable-output-escaping to get the browser to render non-breaking spaces.

On the other hand, the abuse of &nbsp; is IMHO one of the worst
uglinesses of HTML. Nowadays you can do without it most of the time.
For example when indenting code, you could set margins with CSS :

<style>
.code DIV { margin-left:2em; }
</style>

<div class="code" style="font-family:courier">
int f(void) {
<div>
int x;
<br>
<br>
return 12;
</div>
}
</div>

The indentation is much more predictable than with &nbsp; (you never know
the width of &nbsp; !)
Notice also that I didn't have to keep track of the indentation level.
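
If the code to display already exists as indented text, the nested markup can be generated rather than typed. Here is a rough Python sketch; the function name nest_divs, the four-space indent assumption and the sample input are all mine, and blank lines are simply skipped to keep it short.

```python
from html import escape

def nest_divs(lines, indent_width=4):
    """Turn space-indented source lines into nested <div> blocks."""
    out = ['<div class="code">']
    depth = 0
    prev_was_text = False
    for line in lines:
        text = line.strip(" ")
        if not text:
            continue  # blank lines are skipped in this sketch
        level = (len(line) - len(line.lstrip(" "))) // indent_width
        if level == depth and prev_was_text:
            out.append("<br>")  # same block: needs an explicit line break
        while depth < level:    # deeper indentation opens a nested <div>
            out.append("<div>")
            depth += 1
        while depth > level:    # shallower indentation closes it again
            out.append("</div>")
            depth -= 1
        out.append(escape(text))
        prev_was_text = True
    out.extend(["</div>"] * depth)  # close anything still open
    out.append("</div>")
    return "\n".join(out)

sample = ["int f(void) {", "    int x;", "    return 12;", "}"]
print(nest_divs(sample))
```

The output relies on a stylesheet rule such as the ".code DIV" one above to supply the actual indentation; the generator only emits the nesting.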

In other cases, you can use the "width" CSS property with an empty element
(made inline-block, since plain inline elements ignore "width"), as in :

<div>This whitespace for rent : [<span style="display:inline-block; width:3em;"></span>]</div>

<pre> also works; you can always override the font it uses if you don't want
a monospaced font, for example. Though this may be seen as another kind of
HTML abuse.

There's also CSS's "white-space: pre", but IE supports it only in IE6 in
"standards-compliant mode". I don't know about NS6.

All in all, I'm not sure there's still a need for &nbsp; outside of
non-breaking spaces... anyone care to comment ?

Hope this helps
Jonathan


