C1 characters in XML 1.0 and HTML 4
- From: "Waters, Michael, Springer US" <Mike.Waters@springer.com>
- To: "XML Developers List" <xml-dev@lists.xml.org>
- Date: Sat, 12 Mar 2011 18:02:09 -0500
I found some related material in the list archives, but I wanted to check my
understanding of the use of C1 characters in XML 1.0 and in HTML 4.
We have a UTF-8 encoded XML document that has gone through a number of
conversions and import/export routines into/out of a CMS. At all times, the
XML document was valid against the DTD, and in Oxygen everything seems fine.
No errors were reported in the workflow until a late stage, where in rendering
to HTML Saxon reported:
net.sf.saxon.trans.DynamicError: Illegal HTML character: decimal 146
I traced the error to an article title, where there was an embedded hex
character reference:
Language rights versus speakers&#x92; rights
Unicode character U+0092 is listed as a C1 control character (PRIVATE USE
TWO), not as a printable character. I can’t see our vendor or any workflow
step (un)intentionally adding that character. About the only thing that makes
sense to me is that at some point (probably in the source document)
Windows-1252 encoding was used, where decimal 146 is, I think, a right single
quote. (Whether that’s the appropriate character in this case is another
matter.)
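
A minimal sketch of the Windows-1252 hypothesis, assuming the stray code point
began life as a single Windows-1252 byte. The helper name cp1252_reading is
hypothetical; it only reinterprets a code point in the range 0x80-0x9F as a
byte and decodes it with Python's built-in cp1252 codec.

```python
import unicodedata


def cp1252_reading(code_point: int) -> str:
    """Reinterpret a code point in 0x80-0x9F as a Windows-1252 byte.

    Hypothetical helper for illustration. A few bytes in this range
    (for example 0x81) are undefined in Python's cp1252 codec and
    raise UnicodeDecodeError.
    """
    if not 0x80 <= code_point <= 0x9F:
        raise ValueError("expected a code point in the C1 range 0x80-0x9F")
    return bytes([code_point]).decode("cp1252")


# Decimal 146 is the value from the Saxon error message.
ch = cp1252_reading(146)
print(f"U+{ord(ch):04X}", unicodedata.name(ch))
# prints: U+2019 RIGHT SINGLE QUOTATION MARK

# U+0092 itself is in general category Cc (control).
print(unicodedata.category("\u0092"))
# prints: Cc
```

If the hypothesis holds, the intended character in the title would be U+2019
rather than U+0092.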
So, in all the XML processes, character U+0092 was passed through as legal,
but in outputting to HTML it is illegal? I’m missing something here, surely.
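
The two rules in question can be sketched side by side. This assumes the Char
production of XML 1.0 for the first check, and for the second assumes that the
HTML serializer rejects code positions 127-159, which the HTML 4 SGML
declaration marks as UNUSED. The function names are hypothetical, and the
second check is not taken from Saxon's source.

```python
def is_xml10_char(cp: int) -> bool:
    """True if cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_html4_unused(cp: int) -> bool:
    """True if cp falls in 127-159, assumed here to be rejected for HTML 4."""
    return 127 <= cp <= 159


cp = 0x92  # decimal 146, the character from the article title
print("legal in XML 1.0:", is_xml10_char(cp))
print("unused in HTML 4:", is_html4_unused(cp))
```

Under these assumptions the same code point passes the XML check and fails the
HTML one, which would match the behaviour described above.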
Curiously, in my readings, HTML 5 seems to be special-casing Windows-1252
encoding, along with UTF-8, in that it must be supported:
http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#character-encodings-0
Best regards,
Mike Waters