
Re: Selecting First Direct Sibling

Subject: Re: Selecting First Direct Sibling
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Tue, 21 Aug 2007 11:48:43 -0400
Wasiq,

At 10:24 AM 8/21/2007, Hugh wrote:
In my classes - there is typically at least one student who will get surprised by this.

Everything in the input XML will create a node - even whitespace (tabs, line ends, CR/LF) - unless you take explicit steps to remove whitespace that is not important to your processing. A text node IS an actual node whose value is the text - even if that text is only whitespace.

This is the old markup bugbear, the question of "when is whitespace really whitespace, and when is it only whitespace"?


XML does try in certain contexts to distinguish between "significant" whitespace (that is, space that's actually part of the data content of the document) and "insignificant" whitespace (space that's only there to make the marked-up version more legible, and is not meant to be respected by an application).

Typically this distinction relies on a schema. For example, whitespace inside a paragraph (including what in XSLT constitutes a whitespace-only text node, such as the single space between the term and emph elements below) should be significant:

<sec>
  <p>In my <term>paragraph</term> <emph>I need my whitespace!</emph></p>
</sec>

... but whitespace between paragraphs (or here, inside the 'sec' but not the 'p') is not. We can tell this from a DTD or schema that declares p elements to have mixed content (#PCDATA or the equivalent), whereas the sec presumably has element-only content and does not allow #PCDATA.
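
To see those nodes for yourself, here is a minimal sketch (an illustrative stylesheet, not taken from the thread) that declares no xsl:strip-space and simply counts what sits under the sec in the sample above. It assumes the sample is the whole input document and that the parser hands the processor the tree with its whitespace intact.

```xml
<?xml version="1.0"?>
<!-- Sketch only: no xsl:strip-space here, so whitespace-only
     text nodes stay in the source tree. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/sec">
    <!-- every child of sec: text (newline + indent), p, text (newline) -->
    <xsl:text>sec child nodes: </xsl:text>
    <xsl:value-of select="count(node())"/>
    <!-- element children only: just the p -->
    <xsl:text>&#10;sec child elements: </xsl:text>
    <xsl:value-of select="count(*)"/>
    <!-- the space between term and emph is a whitespace-only text node -->
    <xsl:text>&#10;whitespace-only text nodes in p: </xsl:text>
    <xsl:value-of select="count(p/text()[not(normalize-space())])"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

A conforming processor should report 3 child nodes for sec (two whitespace-only text nodes plus the p), 1 child element, and 1 whitespace-only text node inside the p. A processor or parser configuration that drops whitespace before XSLT ever sees it will give different numbers.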

But not everyone has a schema, and some folks don't even want one.

Accordingly, the XSLT rule (sometimes observed in the breach by certain well-known vendors) is to keep every whitespace-only text node in the source tree unless the stylesheet says to strip it, using the xsl:strip-space and xsl:preserve-space top-level elements.

If you want to be both clean and safe with your whitespace, you'll trust your processor to do this, and name in xsl:strip-space only those elements that have element-only content (as specified in a schema against which your input is known to be valid).
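
For the sample above, that could look something like the following sketch. The choice of 'sec' as the only element-only element is an assumption read off the example, not a real schema, and the identity template is there only so you can inspect what survives stripping.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: sec has element-only content, so whitespace-only
       text nodes directly inside it are safe to strip. -->
  <xsl:strip-space elements="sec"/>

  <!-- Preserving is already the default; naming p here only
       documents the intent that its whitespace is significant. -->
  <xsl:preserve-space elements="p"/>

  <!-- Identity template: copy whatever is left in the source tree. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Run against the sample, the line breaks and indentation around the p should disappear from the output, while the space between the term and emph elements stays put.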

Then, expressions like node() will work the way you want. They'll select whitespace-only nodes, as always (they're nodes too) -- but the ones you don't care about will have been stripped from your source tree.
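
As a rough illustration of the sibling case, here is a sketch against a hypothetical input (a sec holding two p elements, shown in the comment; it is not the original poster's data). With 'sec' named in xsl:strip-space, following-sibling::node()[1] from the first p should land on the second p rather than on the whitespace between them.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input, assumed for illustration:
     <sec>
       <p>First</p>
       <p>Second</p>
     </sec>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Remove this line and the first expression below selects the
       whitespace-only text node between the two paragraphs instead. -->
  <xsl:strip-space elements="sec"/>

  <xsl:template match="/sec">
    <!-- name() is empty for a text node, 'p' for the p element -->
    <xsl:text>next node: [</xsl:text>
    <xsl:value-of select="name(p[1]/following-sibling::node()[1])"/>
    <xsl:text>]&#10;</xsl:text>
    <!-- selecting with * skips text nodes whether or not they are stripped -->
    <xsl:text>next element: [</xsl:text>
    <xsl:value-of select="name(p[1]/following-sibling::*[1])"/>
    <xsl:text>]&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

If you only ever want the next element, following-sibling::*[1] is the safer expression either way; stripping matters when you really do mean node().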

Cheers,
Wendell

----- Original Message ----- From: "Wasiq Shaikh" <wasiq911@xxxxxxxxxxx>
To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Tuesday, August 21, 2007 9:01 AM
Subject: RE:  Selecting First Direct Sibling


Oh I see. I had always thought that node() would select an actual node or text. Didn't think it would select blank spaces.


======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
