That unwanted white space in HTML output
Warren Hedley wrote:
> The whitespace between <a> and <img> elements is a fairly
> common problem [...] can anyone suggest any other element
> types where this behaviour might be necessary?

Yes, all "inline" elements. These are enumerated in the HTML 4 DTDs as the following:

(strict)
TT | I | B | BIG | SMALL | EM | STRONG | DFN | CODE | SAMP | KBD | VAR |
CITE | ABBR | ACRONYM | A | IMG | OBJECT | BR | SCRIPT | MAP | Q | SUB |
SUP | SPAN | BDO | INPUT | SELECT | TEXTAREA | LABEL | BUTTON

(transitional)
TT | I | B | U | S | STRIKE | BIG | SMALL | EM | STRONG | DFN | CODE |
SAMP | KBD | VAR | CITE | ABBR | ACRONYM | A | IMG | APPLET | OBJECT |
FONT | BASEFONT | BR | SCRIPT | MAP | Q | SUB | SUP | SPAN | BDO | IFRAME |
INPUT | SELECT | TEXTAREA | LABEL | BUTTON

I believe a clause should be included in a future version of the XSLT spec: "When emitting a result tree as HTML, whitespace should never be added inside inline elements."

Example: What would normally be emitted as unindented XML like this:

<p><a href="foo"><img src="bar"/></a><br/>some text</p>

...could be emitted as indented HTML like this:

<p>
  <a href="foo"><img src="bar"/></a><br/>some text
</p>

This rule is needed because any whitespace that is added, together with any whitespace adjacent to it, is interpreted as a single "word separator" relative to adjacent text. The browser is supposed to render this separator in a manner appropriate to the script of the language being used, which isn't always predictable. In the Latin-based languages, the word separator is a breaking space.

In the case of inline images, applets and objects, the image, applet or object ends up being treated like a piece of text, with its bottom edge aligned along the baseline of adjacent text, as per the spec. This is normally desirable behavior, but it can be problematic if you are trying to stack images on top of each other: the space allotted for descending characters, plus the space between the bottom edge of descenders and the top edge of the next row of text, is often undesirable.

I made an example of this at http://www.skew.org/xml/misc_demos/whitespace/ and reported it to James Clark as an argument for changing the behavior of XT's HTMLOutputHandler. He gave me a simple "thanks" for the info, but the problem has yet to be resolved.

In the meantime, I've modified HTMLOutputHandler.java with an ugly workaround: removing 'br' from the list of blockElements (its presence there seems to be an error anyway). This of course doesn't resolve every situation, but it was enough for my purposes, for now.

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
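As an illustration of the rule proposed above (never add whitespace inside inline elements), here is a minimal sketch of an indenter in Python. It is not XT's HTMLOutputHandler or any XSLT processor's actual code; the names INLINE, flat and pretty are hypothetical, the inline set is the "strict" list quoted in the post, and the sketch assumes well-formed input and ignores special cases such as PRE. It only breaks lines directly inside a block element, and keeps any run of inline content on a single line exactly as it was.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Inline element names from the HTML 4 strict DTD, as listed in the post.
INLINE = {
    "tt", "i", "b", "big", "small", "em", "strong", "dfn", "code", "samp",
    "kbd", "var", "cite", "abbr", "acronym", "a", "img", "object", "br",
    "script", "map", "q", "sub", "sup", "span", "bdo", "input", "select",
    "textarea", "label", "button",
}

def open_tag(el):
    """Return the start tag of el without its closing '>' or '/>'."""
    attrs = "".join(f" {name}={quoteattr(value)}" for name, value in el.attrib.items())
    return f"<{el.tag}{attrs}"

def flat(el):
    """Serialize el and its descendants with no added whitespace."""
    if el.text is None and len(el) == 0:
        return open_tag(el) + "/>"
    body = escape(el.text or "")
    body += "".join(flat(child) + escape(child.tail or "") for child in el)
    return f"{open_tag(el)}>{body}</{el.tag}>"

def pretty(el, depth=0):
    """Indent block elements only; inline content stays on one line."""
    pad = "  " * depth
    is_leaf = len(el) == 0 and not (el.text or "").strip()
    if el.tag in INLINE or is_leaf:
        return pad + flat(el)
    only_blocks = (
        len(el) > 0
        and all(child.tag not in INLINE for child in el)
        and not (el.text or "").strip()
        and not any((child.tail or "").strip() for child in el)
    )
    if only_blocks:
        # Element-only content made of blocks: safe to put each child on its own line.
        lines = [pretty(child, depth + 1) for child in el]
    else:
        # Text or inline children present: emit the content unchanged on one line,
        # adding whitespace only at the very start and end of the block.
        content = escape(el.text or "")
        content += "".join(flat(child) + escape(child.tail or "") for child in el)
        lines = ["  " * (depth + 1) + content.strip()]
    return "\n".join([pad + open_tag(el) + ">", *lines, pad + f"</{el.tag}>"])

if __name__ == "__main__":
    source = '<p><a href="foo"><img src="bar"/></a><br/>some text</p>'
    print(pretty(ET.fromstring(source)))
```

Run on the unindented example from the post, this sketch produces the indented form shown there: the line breaks fall just inside the P element, and nothing is inserted between A, IMG and BR.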