RE: Is it possible to get "&nbsp;" in output rather th

Subject: RE: Is it possible to get "&nbsp;" in output rather than the encoded character?
From: "Steven Reddie" <smr@xxxxxxxxxxxxxx>
Date: Thu, 20 May 2004 09:56:16 +1000

Thanks to you, Michael and David for confirming that this is a bug.

I do realise that the processing loses such information; in fact, I of
course have a macro which substitutes &nbsp; with &#160;.  I was merely
thinking about it at a higher level when saying that the &#160; should be
left as is.  What I expected to see in the output was either "& # x A 0 ;"
or "& # 1 6 0 ;".  Getting "& n b s p ;" would be a nice bonus when using
the html output method, but frankly I wouldn't expect it to do so.

Regards,

Steven

-----Original Message-----
From: Passin, Tom [mailto:tpassin@xxxxxxxxxxxx] 
Sent: Thursday, 20 May 2004 7:45 AM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: RE:  Is it possible to get "&nbsp;" in output rather than the
encoded character?


> From: Steven Reddie [mailto:smr@xxxxxxxxxxxxxx]

> Sorry, I wasn't being accurate.  I meant is it reasonable to
> expect &#160; to be left as &#160; when outputting us-ascii.  
> This is obviously the same as expecting the two consecutive 
> bytes 0xC2 0xA0 in a utf-8 input stream to be translated to 
> &#160; for a us-ascii output stream.  I don't see anything 
> wrong with such an expectation.
> 

What you haven't absorbed yet is that there is nothing to be "left as &
# 1 6 0 ;".  When the XML parser parses the XML file and encounters
such a character reference, it sticks a Unicode non-breaking space into
the text that it constructs.  Later, the XSLT processor receives that
text and has no way to know that it started out in life as & # 1 6 0 ;,
& n b s p ;, or whatever it was.
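
To make the point above concrete, here is a minimal sketch in Python rather than XSLT. It assumes the standard-library xml.etree.ElementTree parser as a stand-in for whatever parser sits in front of the XSLT processor; the sample documents and the internal DTD declaring nbsp are made up for illustration. Each spelling of the character should come out of the parser as the same one-character text.

```python
import xml.etree.ElementTree as ET

# Four spellings of the same character in the source document.
# The DOCTYPE with an nbsp entity is an illustrative assumption:
# plain XML has no predefined &nbsp;, so it must be declared.
sources = {
    "decimal reference": "<p>a&#160;b</p>",
    "hex reference": "<p>a&#xA0;b</p>",
    "literal character": "<p>a\u00a0b</p>",
    "named entity": '<!DOCTYPE p [<!ENTITY nbsp "&#160;">]><p>a&nbsp;b</p>',
}

# After parsing, only the text remains; the original spelling is gone.
parsed = {name: ET.fromstring(doc).text for name, doc in sources.items()}

for name, text in parsed.items():
    print(f"{name:18} -> {text!r}")
```

Whatever comes after the parser only ever sees the U+00A0 character, so it has nothing to "leave as is".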

So when the XSLT transformation is done and gets handed off to the
serializer, there is no memory of the original source of that character
(and how could there be, really - after all, you might have added the
character during the transformation).  So the serializer has to figure
out what to output, and if the encoding does not support a character, how
to escape it.  The easiest and most reliable thing to do is output an &
# x 0 0 ; or & # x 0 0 0 0 ; character reference, so that is what you
usually get.
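
The serializer's choice described above can be sketched roughly as follows, again in Python and not as any particular XSLT processor's implementation. The serialize helper is hypothetical; it leans on Python's built-in "xmlcharrefreplace" error handler, which emits a decimal character reference for anything the target encoding cannot represent. A real serializer would also escape markup characters such as & and <, which this sketch ignores.

```python
# Text as the serializer receives it: one non-breaking space, origin unknown.
text = "a\u00a0b"

def serialize(text: str, encoding: str) -> bytes:
    """Hypothetical helper: write the character directly if the encoding
    supports it, otherwise fall back to a numeric character reference."""
    return text.encode(encoding, errors="xmlcharrefreplace")

ascii_out = serialize(text, "us-ascii")     # not representable: reference
latin1_out = serialize(text, "iso-8859-1")  # representable: single byte 0xA0
utf8_out = serialize(text, "utf-8")         # representable: bytes 0xC2 0xA0

print(ascii_out)
print(latin1_out)
print(utf8_out)
```

With us-ascii the character comes out as the reference & # 1 6 0 ;, and with utf-8 as the two bytes 0xC2 0xA0 mentioned in the quoted message.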

The above is what is "reasonable" to expect - output that character if
the encoding supports it, output an appropriate character reference if
not, and oh, yes, if you are lucky maybe output the corresponding html
entity or maybe not.  Any other behavior is nonconforming and incorrect.

Cheers,

Tom P
