- From: "Randy McGarvey" <rmcgarvey@g...>
- To: "Michael Kay" <mike@s...>,<xml-dev@l...>
- Date: Tue, 30 Oct 2007 10:02:57 -0400
Thanks for the notes on How to solve this issue. I was
really hoping to get a different answer! :-) I hadn't considered
modifying the entity file or using processing instructions to protect the
entities from being resolved.
Can anyone address the Why and include the perspective of a parser requirements
writer / standards committee member? To me, this seems like valuable
functionality that is lacking from the current tools.
>> Randy
It's a real pain that doesn't have a common solution. I
tend to
(a) avoid using entities. Because I only ever use XML via
XSLT, processing-instructions are much more manageable.
(b) if I do use entities, don't rely on them remaining
intact - i.e. there should be no difference in information content between an
entity and its expansion (so you can always re-entitize mechanistically if you
need to).
(c) preprocess, as suggested, to replace the ampersands
by something else: for example <?ent mdash?>.
Michael Kay
http://www.saxonica.com/
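A minimal sketch of option (c), assuming Python and its standard xml.etree.ElementTree module (Python 3.8 or later, for the insert_pis option). The <?ent name?> placeholder follows the example above; the function names protect_entities, restore_entities and round_trip are hypothetical, chosen only for illustration. The sketch assumes entity references occur in element content only: a processing instruction is not allowed inside an attribute value, and the regular expression would also rewrite text inside CDATA sections, so real data may need a more careful pre-process.

```python
import re
import xml.etree.ElementTree as ET

# The five entities predefined by XML are left for the parser to resolve.
PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

# A named entity reference such as &mdash; (numeric references like
# &#8212; do not match and are left alone).
ENTITY_REF = re.compile(r"&([A-Za-z_][A-Za-z0-9._-]*);")

# The placeholder processing instruction, e.g. <?ent mdash?>.
ENTITY_PI = re.compile(r"<\?ent ([A-Za-z_][A-Za-z0-9._-]*)\?>")


def protect_entities(xml_text):
    """Replace named entity references with <?ent name?> placeholders.

    Assumption: references appear in element content only, not in
    attribute values or CDATA sections.
    """
    def to_pi(match):
        name = match.group(1)
        if name in PREDEFINED:
            return match.group(0)  # keep &amp;, &lt;, etc. as they are
        return "<?ent %s?>" % name

    return ENTITY_REF.sub(to_pi, xml_text)


def restore_entities(xml_text):
    """Turn <?ent name?> placeholders back into &name; references."""
    return ENTITY_PI.sub(r"&\1;", xml_text)


def round_trip(xml_text):
    """Protect, parse, serialize, and restore (hypothetical pipeline)."""
    protected = protect_entities(xml_text)
    # insert_pis=True keeps processing instructions in the parsed tree.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_pis=True))
    root = ET.fromstring(protected, parser=parser)
    return restore_entities(ET.tostring(root, encoding="unicode"))


if __name__ == "__main__":
    source = "<p>See &sect; 4 &mdash; fish &amp; chips</p>"
    print(protect_entities(source))
    print(round_trip(source))
```

Any real processing (an XSLT transform, for instance) would sit between the parse and the serialization; it only needs to pass the ent processing instructions through untouched.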
If I have data with character
entities such as &sect; or &mdash; in the XML, what is the best way
to keep those intact, as is, after a parse? Are there any parsers that
have an option not to resolve entities? What is the best way you've
found to deal with this issue? Do you escape the ampersands (e.g.
&amp;sect;) in a pre-process? Do you address it in an entity
handler to re-write the original entity text? This seems like a real
pain that must have a common solution.
Thanks! >> Randy