
That unwanted white space in HTML output

Subject: That unwanted white space in HTML output
From: Mike Brown <mbrown@xxxxxxxxxxxxx>
Date: Thu, 3 Feb 2000 15:08:42 -0700
Warren Hedley wrote:
> The whitespace between <a> and <img> elements is a fairly 
> common problem [...] can anyone suggest any other element
> types where this behaviour might be necessary?

Yes, all "inline" elements. These are enumerated in the HTML 4 DTDs as the
following:

(strict)
TT | I | B | BIG | SMALL | EM | STRONG | DFN | CODE | SAMP | KBD | VAR |
CITE | ABBR | ACRONYM | A | IMG | OBJECT | BR | SCRIPT | MAP | Q | SUB | SUP
| SPAN | BDO | INPUT | SELECT | TEXTAREA | LABEL | BUTTON

(transitional)
TT | I | B | U | S | STRIKE | BIG | SMALL | EM | STRONG | DFN | CODE | SAMP
| KBD | VAR | CITE | ABBR | ACRONYM | A | IMG | APPLET | OBJECT | FONT |
BASEFONT | BR | SCRIPT | MAP | Q | SUB | SUP | SPAN | BDO | IFRAME | INPUT |
SELECT | TEXTAREA | LABEL | BUTTON

I believe a clause should be included in a future version of the XSLT spec:
"When emitting a result tree as HTML, whitespace should never be added
inside inline elements."

Example:

What would normally be emitted as unindented XML like this:
<p><a href="foo"><img src="bar"/></a><br/>some text</p>

...could be emitted as indented HTML like this:
<p>
<a href="foo"><img src="bar"/></a><br/>some text
</p>
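
For what it's worth, here is a rough Python sketch of how a serializer might apply that rule. It is only an illustration, not how XT or any other processor works: the names INLINE, tag, verbatim, content and indent_html are all made up for this example, the inline set is the strict list above in lowercase, and empty elements are written XML-style ("<br/>") to match the notation in the example above. Line breaks are added only just inside block-level tags, never inside or between inline elements.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Inline element names from the HTML 4 strict list above (lowercased).
INLINE = {
    "tt", "i", "b", "big", "small", "em", "strong", "dfn", "code", "samp",
    "kbd", "var", "cite", "abbr", "acronym", "a", "img", "object", "br",
    "script", "map", "q", "sub", "sup", "span", "bdo", "input", "select",
    "textarea", "label", "button",
}

def tag(elem, end=">"):
    """Start tag of elem; pass end="/>" for an XML-style empty element."""
    attrs = "".join(f" {k}={quoteattr(v)}" for k, v in elem.attrib.items())
    return f"<{elem.tag}{attrs}{end}"

def verbatim(elem):
    """Serialize elem and its descendants without adding any whitespace."""
    if elem.text is None and len(elem) == 0:
        return tag(elem, "/>")
    return tag(elem) + content(elem) + f"</{elem.tag}>"

def content(elem):
    """Serialize the children and text of elem exactly as they are."""
    out = escape(elem.text or "")
    for child in elem:
        out += verbatim(child) + escape(child.tail or "")
    return out

def indent_html(elem, depth=0):
    """Indent block-level structure only (hypothetical helper)."""
    pad = "  " * depth
    has_text = bool((elem.text or "").strip()) or any(
        (c.tail or "").strip() for c in elem
    )
    has_inline = any(c.tag in INLINE for c in elem)
    if elem.tag in INLINE or (len(elem) == 0 and not has_text):
        # Inline or empty element: emit as-is.
        return pad + verbatim(elem)
    if has_text or has_inline:
        # Block with inline content: break lines only just inside its tags.
        return f"{pad}{tag(elem)}\n{pad}{content(elem)}\n{pad}</{elem.tag}>"
    # Block containing only blocks: each child goes on its own indented line.
    children = "\n".join(indent_html(c, depth + 1) for c in elem)
    return f"{pad}{tag(elem)}\n{children}\n{pad}</{elem.tag}>"

src = '<p><a href="foo"><img src="bar"/></a><br/>some text</p>'
print(indent_html(ET.fromstring(src)))
```

Run on the unindented input above, this prints the same three lines as the indented version: the <a>, <img> and <br> stay on one line with nothing between them.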


This rule is needed because any added whitespace, together with any
whitespace adjacent to it, is interpreted as a single "word separator"
relative to adjacent text. The browser is supposed to render this separator
in a manner appropriate to the script of the language being used, which
isn't always predictable. In languages written in Latin script, the word
separator is a breaking space.
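
To make the effect concrete, here is a deliberately crude Python sketch of the collapsing. It is only an approximation of what a browser does: rendered_text is a made-up helper, tags are stripped with a naive regular expression, and only the ASCII whitespace characters are considered.

```python
import re

def rendered_text(markup):
    """Rough approximation of the text a browser would lay out (hypothetical)."""
    text = re.sub(r"<[^>]*>", "", markup)        # drop tags (naive)
    text = re.sub(r"[ \t\r\n\f]+", " ", text)    # each whitespace run -> one space
    return text.strip()

compact = "<b>one</b><i>two</i>"
indented = "<b>one</b>\n  <i>two</i>"

print(repr(rendered_text(compact)))
print(repr(rendered_text(indented)))
```

The compact form comes out as 'onetwo', while the indented form comes out as 'one two': the newline and indentation added between the two inline elements have turned into a visible word separator.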

In the case of inline images, applets and objects, the image, applet or
object ends up being treated like a piece of text, with its bottom edge
aligned along the baseline of adjacent text, as per the spec. This is
normally desirable behavior, but can be problematic if you are trying to
stack images on top of each other. The space allotted for descending
characters, plus the space between the bottom edge of the descenders and
the top edge of the next row of text, shows up as an unwanted gap.

I made an example of this at http://www.skew.org/xml/misc_demos/whitespace/
and reported it to James Clark as an argument for changing the behavior of
XT's HTMLOutputHandler. He gave me a simple "thanks" for the info, but the
problem has yet to be resolved.

In the meantime, I've modified HTMLOutputHandler.java with an ugly
workaround: removing 'br' from the list of blockElements (its presence
there seems to be an error anyway). This of course doesn't resolve every
situation, but it was enough for my purposes, for now.


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

