On 02/10/2015 12:24 AM, Liam R E Quin liam@xxxxxx wrote:
> On Mon, 9 Feb 2015 20:21:50 -0000
> "dvint@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>> <dd><p>
>> This is my text
>> with <i>italics content</i> with other text.
>> </p></dd>
>>
>> My output is coming out like this:
>> <ss:Data>This is my text with<ss:font italics="yes">italics
>> content</ss:font>.</ss:Data>
>
> I'd probably do this in two steps -
> (1) match text() and turn one or more whitespace characters into a space,
> probably using replace()
> (2) strip leading space from the first text() in p, and trailing space from
> the last;
I do almost exactly this in several applications. I think it's fairly
common.
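
A minimal sketch of those two steps, assuming an XSLT 2.0 processor
(replace() is not available in 1.0) and that paragraphs are p elements
as in the example above. The identity template is an assumption about
the surrounding stylesheet, not something from the original question:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed context: copy everything not handled below unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="p//text()">
    <xsl:variable name="para" select="ancestor::p[1]"/>
    <!-- Step 1: turn each run of whitespace into a single space -->
    <xsl:variable name="collapsed" select="replace(., '\s+', ' ')"/>
    <!-- Step 2a: drop the leading space on the first text node in the p -->
    <xsl:variable name="ltrimmed"
        select="if (. is ($para//text())[1])
                then replace($collapsed, '^ ', '')
                else $collapsed"/>
    <!-- Step 2b: drop the trailing space on the last text node in the p -->
    <xsl:value-of
        select="if (. is ($para//text())[last()])
                then replace($ltrimmed, ' $', '')
                else $ltrimmed"/>
  </xsl:template>

</xsl:stylesheet>
```

Only the edges of the paragraph are trimmed; spaces at the edges of
inline elements are collapsed but kept, so the words on either side
stay separated.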
> watch for
> <p>The man wore<i> black </i>socks</p>
> which is not unlikely in XML made from word processing software.
Slightly more common would be <p>The man wore <i>black </i>socks</p>
where a double-click highlight in the WP software included the trailing
space on the word (someone just told me Word has just stopped doing
this: can anyone confirm?).
More pernicious is the erroneous elision of white-space-only nodes in
mixed content:
<p>The man wore <b>black socks</b> <i>only</i> on Tuesdays.</p>
resulting in "The man wore black socksonly on Tuesdays." due to a faulty
xsl:strip-space (white-space-only nodes between subelements in mixed
content should probably never be removed, which is sometimes hard to
explain to people unaccustomed to document-class XML).
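
One way to guard against this, sketched here on the assumption that p
is the only mixed-content element in the vocabulary (a real document
type would need every mixed-content element listed), is to pair the
blanket strip-space with an explicit preserve-space:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Strip whitespace-only text nodes in element-only containers... -->
  <xsl:strip-space elements="*"/>
  <!-- ...but keep them where the parent element has mixed content.
       The named element takes precedence over the * wildcard. -->
  <xsl:preserve-space elements="p"/>

  <xsl:template match="p">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Both declarations are tested against the parent of the white-space-only
node, which here is the p, so the space between the b and the i
survives. Remove the preserve-space line and the two words run
together as in the output above.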
///Peter