
Re: XML processing experiments

  • From: Jarle Stabell <jarle.stabell@d...>
  • To: xml-dev@i...
  • Date: Fri, 07 Nov 1997 17:32:55 +0100

James Clark wrote:
>> Given XML's requirements that entity references in the instance are
>> synchronous, I would have thought that the overhead of an entity stack
>> could be avoided for parsing the instance.  The parser passes the
>> application an entity reference event, and the application can then, if
>> it chooses, recursively invoke the parser to parse the referenced
>> entity.

Richard Tobin wrote:
>Entity references are expanded, and a bit may end in a different
>entity from the one it started in (suppose foo is defined as "a<b/>c";
>then the first bit returned from "x&foo;y" is "xa" - as far as I can
>tell this is quite legal XML).

I don't think this is legal. The working draft (sec. 4.1) says:
"The logical and physical structures (elements and entities) in an XML
document must be synchronous. Tags and elements must each begin and end in
the same entity, but may refer to other entities internally; comments,
processing instructions, character references, and entity references must
each be contained entirely within a single entity"


It seems to me that with the current whitespace handling, one could nearly
(?) parse each entity locally and build a subtree from it if a tree is
wanted. This might make error reporting easier and would probably improve
parsing speed, though it could add a bit of complexity to the
implementation.
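
To make the idea concrete, here is a minimal sketch of parsing one entity's
replacement text on its own. It is only an illustration, not a real parser
design: the wrapper element name "entity" and both function names are made
up, it leans on Python's xml.etree.ElementTree, and it assumes the
replacement text contains no further entity references.

```python
import xml.etree.ElementTree as ET

def parse_entity_locally(replacement_text):
    """Parse one entity's replacement text on its own and return a subtree.

    The text is wrapped in a synthetic <entity> element (a hypothetical
    name) so that mixed content such as "a<b/>c" has a single root.
    """
    return ET.fromstring("<entity>" + replacement_text + "</entity>")

def is_self_contained(replacement_text):
    """True if every tag opened in the entity is also closed in it."""
    try:
        parse_entity_locally(replacement_text)
        return True
    except ET.ParseError:
        return False

# foo defined as "a<b/>c": the element b begins and ends inside foo.
subtree = parse_entity_locally("a<b/>c")
print(subtree.text, subtree[0].tag, subtree[0].tail)

# A start-tag without its end-tag cannot be parsed locally.
print(is_self_contained("a<b>c"))
```

A tree-building parser could graft the children of such a subtree in at the
point of the reference, and any error found while parsing the entity can be
reported relative to the entity itself.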

As Mr. Clark indicates, a parser doesn't need to take much of a performance
hit when entities are not present: the entity stack has no influence (it is
kept constant) when parsing, for instance, a start-tag.
(If entity references are present in the attribute values, they can be
expanded afterwards if wanted. Authoring tools etc. often don't want this
expansion to happen.)
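
A rough sketch of what "expanded afterwards" could look like: the parser
hands back the raw attribute value, and a separate helper expands general
entity references only on request. The helper name, the regular expression
and the entity table are all assumptions for illustration; attribute-value
normalization and character references are deliberately left out.

```python
import re

# Matches a general entity reference such as "&co;" (not "&#38;").
ENTITY_REF = re.compile(r"&([A-Za-z_:][\w:.-]*);")

def expand_later(raw, entities, active=()):
    """Expand general entity references in a raw attribute value on demand.

    `entities` maps entity names to replacement text (a hypothetical
    table). `active` tracks the entities currently being expanded so that
    a recursive definition is reported instead of looping forever.
    """
    def repl(match):
        name = match.group(1)
        if name in active:
            raise ValueError("recursive entity reference: " + name)
        return expand_later(entities[name], entities, active + (name,))
    return ENTITY_REF.sub(repl, raw)

entities = {"co": "Acme &suffix;", "suffix": "Ltd"}
raw_value = "Made by &co;"

# An authoring tool keeps raw_value as written; other callers expand it.
print(raw_value)
print(expand_later(raw_value, entities))
```

Parsing the start-tag itself never touches the entity table here, which is
the point being made above.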

I (currently!) think it is possible to design a 'real' parser that looks,
locally, much the same as Mr. Clark's "quick and dirty" parser.
(I'm in the early stages of implementing one.)


BTW: Does anyone have an example where the immediate expansion of character
references within internal entities actually comes in handy?
To me this seems to make the parser use more memory and perhaps run
slower, but more importantly, it ruins the copy-paste semantics of entity
expansion.

What will "normal" people think about such things as the example from the
draft:

<!ENTITY example "<p>An ampersand (&#38;#38;) may be escaped
numerically (&#38;#38;#38;) or with a general entity
(&amp;amp;).</p>" >


I think most people will regard this as a bug/design flaw.
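
For anyone who wants to trace the example, here is a small sketch of the two
stages. The function expand_char_refs is a made-up helper that mimics what
happens to character references when the entity declaration is read; the
second stage uses Python's xml.etree.ElementTree (expat underneath) on a
small test document of my own to show what the reference finally yields.

```python
import re
import xml.etree.ElementTree as ET

CHAR_REF = re.compile(r"&#(?:x([0-9a-fA-F]+)|([0-9]+));")

def expand_char_refs(literal):
    """Expand character references once, as done when a declaration is read.

    The result is not rescanned, so "&#38;#38;" becomes "&#38;", not "&".
    General entity references such as "&amp;" are left alone.
    """
    def repl(match):
        hex_digits, dec_digits = match.groups()
        return chr(int(hex_digits, 16)) if hex_digits else chr(int(dec_digits))
    return CHAR_REF.sub(repl, literal)

# Stage 1: literal entity value -> replacement text.
for piece in ("&#38;#38;", "&#38;#38;#38;", "&amp;amp;"):
    print(piece, "->", expand_char_refs(piece))

# Stage 2: the replacement text is parsed where the entity is referenced.
doc = """<!DOCTYPE test [
<!ENTITY example "<p>An ampersand (&#38;#38;) may be escaped
numerically (&#38;#38;#38;) or with a general entity
(&amp;amp;).</p>" >
]>
<test>&example;</test>"""
p = ET.fromstring(doc).find("p")
print(" ".join(p.text.split()))
```

So each extra layer of escaping in the declaration exists only because the
literal is processed once when it is declared and again when it is
referenced.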

I would feel better if I knew of an example where this behaviour actually
comes in handy... :-)



Cheers,
Jarle Stabell

