
Re: SAX2: LexicalHandler

  • From: David Megginson <david@m...>
  • To: XMLDev list <xml-dev@i...>
  • Date: Wed, 22 Dec 1999 11:23:36 -0500 (EST)

David Brownell writes:

 > For DOM Level 2 support, the literal text of the internal subset
 > needs to be provided.

You're kidding!  That's disgusting -- I'm strongly tempted just to
leave the DOM people dangling on that one.  After all, the proposed
SAX2 interfaces provide enough information to construct an equivalent
internal subset.
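
To make that claim concrete, here is a minimal sketch of rebuilding an
equivalent (not literal) internal subset from declaration callbacks.  It
assumes the DeclHandler interface and the declaration-handler property
id as they appear in org.xml.sax.ext in current JDKs, which may differ
from the draft under discussion here.  The class name SubsetRebuilder is
made up for illustration, and the sketch ignores quoting of special
characters in entity values.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DeclHandler;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects declaration events and re-serializes them.
class SubsetRebuilder extends DefaultHandler implements DeclHandler {
    private final StringBuilder subset = new StringBuilder();

    @Override
    public void elementDecl(String name, String model) {
        subset.append("<!ELEMENT ").append(name).append(' ').append(model).append(">\n");
    }

    @Override
    public void attributeDecl(String eName, String aName, String type,
                              String mode, String value) {
        subset.append("<!ATTLIST ").append(eName).append(' ')
              .append(aName).append(' ').append(type);
        if (mode != null) subset.append(' ').append(mode);            // e.g. #IMPLIED
        if (value != null) subset.append(" \"").append(value).append('"');
        subset.append(">\n");
    }

    @Override
    public void internalEntityDecl(String name, String value) {
        subset.append("<!ENTITY ").append(declName(name))
              .append(" \"").append(value).append("\">\n");
    }

    @Override
    public void externalEntityDecl(String name, String publicId, String systemId) {
        subset.append("<!ENTITY ").append(declName(name));
        if (publicId != null) subset.append(" PUBLIC \"").append(publicId).append("\" \"");
        else subset.append(" SYSTEM \"");
        subset.append(systemId).append("\">\n");
    }

    // Parameter entity names are reported with a leading '%'.
    private static String declName(String name) {
        return name.startsWith("%") ? "% " + name.substring(1) : name;
    }

    static String rebuild(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        SubsetRebuilder handler = new SubsetRebuilder();
        reader.setContentHandler(handler);
        reader.setProperty("http://xml.org/sax/properties/declaration-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.subset.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(rebuild(
            "<!DOCTYPE doc [<!ELEMENT doc (#PCDATA)>"
            + "<!ATTLIST doc id CDATA #IMPLIED>"
            + "<!ENTITY e1 \"hello\">]><doc>&e1;</doc>"));
    }
}
```

Comments, whitespace, and the original order relative to parameter
entity references are not recoverable this way, which is the sense in
which the result is only equivalent.  Notations and unparsed entities
arrive through DTDHandler instead and are left out of the sketch.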

 > 
 > >     public void startEntity (String name) throws IOException;
 > >     public void endEntity (String name) throws IOException;
 > 
 > A bunch of restrictions to this were identified as being essential,
 > such as the fact that entities expanded within other constructs
 > mustn't be exposed.  For example:
 > 
 > 	<!ATTLIST foo %std-attrs; %i18n-attrs; %gooey-attrs;>
 > 
 > 	<element foo="&entity1;" bar="&entity2;" />

Agreed.

 > I'm hoping the full spec for those callbacks makes clear that
 > in such situations the entities MUST NOT be reported.  (And
 > would strongly prefer that parameter entities never show up
 > in any context whatsoever.)

To tell the truth, I don't think that many people really need any of
this stuff, so it's hard for me to distinguish one type of noise from
another.  If I were dictator, the only things I'd put in SAX2 would be
property/feature queries and Namespace support.
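
For readers unfamiliar with those two pieces, a small sketch of what a
feature query plus Namespace processing looks like.  It assumes the
feature id "http://xml.org/sax/features/namespaces" and the XMLReader
methods as they exist in current JDKs; the class and method names
(FeatureProbe, supports, firstElementName) are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class FeatureProbe {
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    // Feature query: true if the reader can report a value for this feature id.
    static boolean supports(XMLReader reader, String featureId) {
        try {
            reader.getFeature(featureId);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return false;
        }
    }

    // Namespace support: returns { namespace URI, local name } of the root element.
    static String[] firstElementName(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setFeature(NAMESPACES, true);
        final String[] result = new String[2];
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (result[0] == null) {
                    result[0] = uri;
                    result[1] = localName;
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return result;
    }

    public static void main(String[] args) throws Exception {
        String[] name = firstElementName("<p:a xmlns:p='urn:x'/>");
        System.out.println("{" + name[0] + "}" + name[1]);
    }
}
```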

 > The reason was briefly that applications can't see inside the
 > structure of those constructs -- they'll just see some start/end
 > entity calls, FOLLOWED (oops!) by the callback of which they're
 > a part.  Just like they would if the entities preceded that
 > construct.

Agreed -- entity boundaries inside attribute values are forever lost.
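
A short sketch of what that looks like from the application side,
assuming the one-argument startEntity/endEntity callbacks and the
lexical-handler property id as found in org.xml.sax.ext in current JDKs
(DefaultHandler2 is used only to avoid writing empty methods; the class
name EntityBoundaryDemo is hypothetical).  With the JDK's bundled parser
I would expect the reference in content to produce boundary events while
the one inside the attribute value shows up only as expanded text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical demo: records entity boundary events next to element events.
class EntityBoundaryDemo extends DefaultHandler2 {
    final List<String> events = new ArrayList<>();

    @Override
    public void startEntity(String name) {
        events.add("startEntity:" + name);
    }

    @Override
    public void endEntity(String name) {
        events.add("endEntity:" + name);
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        // The attribute value arrives already expanded; no boundary is visible here.
        events.add("startElement:" + qName + " foo=" + atts.getValue("foo"));
    }

    static List<String> run(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        EntityBoundaryDemo handler = new EntityBoundaryDemo();
        reader.setContentHandler(handler);
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE element ["
            + "<!ENTITY entity1 \"one\"><!ENTITY entity2 \"two\">]>"
            + "<element foo=\"&entity1;\">&entity2;</element>";
        for (String event : run(xml)) {
            System.out.println(event);
        }
    }
}
```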

 > >	 I wonder if a little
 > > redundancy would make sense:
 > > 
 > >     public void startEntity (String name, String publicId,
 > >                              String systemId) throws IOException;
 > >     public void endEntity (String name) throws IOException;
 > > 
 > > That way, if the parser supports the LexicalHandler but not the
 > > DeclHandler, the public and system identifiers for entities will still
 > > be available.
 > 
 > That wouldn't handle internal entities, though.

For internal entities, both publicId and systemId would be null, and
the value would be the text that appears before the corresponding
endEntity callback.
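
A minimal sketch of recovering an internal entity's text that way, by
buffering whatever characters() delivers between the two callbacks.  It
is written against the one-argument startEntity(String name) found in
org.xml.sax.ext.LexicalHandler in current JDKs, not the three-argument
form floated above; the class name EntityTextCapture is hypothetical,
and markup inside the entity is ignored.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: maps each reported entity name to the character
// data seen between its startEntity and endEntity callbacks.
class EntityTextCapture extends DefaultHandler2 {
    private final Map<String, String> values = new LinkedHashMap<>();
    private final Deque<StringBuilder> open = new ArrayDeque<>();

    @Override
    public void startEntity(String name) {
        open.push(new StringBuilder());
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text belongs to every entity that is currently open (entities can nest).
        for (StringBuilder buffer : open) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endEntity(String name) {
        values.put(name, open.pop().toString());
    }

    static Map<String, String> capture(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        EntityTextCapture handler = new EntityTextCapture();
        reader.setContentHandler(handler);
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(capture(
            "<!DOCTYPE doc [<!ENTITY greeting \"hello world\">]>"
            + "<doc>say &greeting;!</doc>"));
    }
}
```

Only entities actually referenced in content are seen this way, so an
entity that is declared but never used (or used only inside an attribute
value) leaves no trace in the map.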

 > I have fundamental issues with the notion of exposing the entity
 > structure of documents beyond that needed to recreate the DOCTYPE
 > declaration (DTD).  Not just in SAX; DOM does it pretty poorly too
 > (children of entity refs must be readonly, making them impossible
 > to manipulate in typical ways).

Yes, I know -- that's why I want (at least) to make all of this mess
optional.  XML is simple at heart, but not when they start letting API 
writers loose on it.

 > So I'd really rather not see that particular thing done ... if
 > any substantial change is to be made to entity reporting, my vote
 > is to just drop it entirely.  It's too messy a notion (IMHO) to
 > show up in any API offering higher level notions than lexical
 > tokens. (angle bracket, name, space, name token, space, equals,
 > double quote, text, entity ref, text, double quote, angle bracket,
 > text ... you get the idea.)

I'd like to leave it out as well.  Personally, I think that the XML
community would be better served if purely lexical items like
Namespace prefixes, the DOCTYPE declaration, comments, element type
declarations, entity boundaries, etc. were simply inaccessible through
any standard API -- that way, the APIs would be easier to learn and
the obfuscators of the world would be less likely to abuse them.

I am tired, however, of all the e-mails from DOM implementors who
want comments (for example) in SAX so that they can bloat their DOM
trees with them.  They're wrong, of course, but I'm too tired to fight
any more.


All the best,


David

-- 
David Megginson                 david@m...
           http://www.megginson.com/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)


