Re: SAX2 Lexical Handler Suggestions
I can see that you come from the same camp as most XML programmers I've encountered, which I would sum up as "XML is for documents, why would you want it to do anything more?"

Our system is a web-based authoring system that parses the XHTML/AML (our own markup extensions) once and holds it in memory as an executable tree, so that each request for that page can generate a completely new HTML document from the executable XHTML template by simply running the chain. If all the attributes (which we use as dynamic macros) are silently replaced by the parser, we have no way of dynamically inserting the proper macro value into the generated HTML.

As for the attribute's raw text, I need some way to tell which attribute values came from entities expanded by the processor and which were entered by the user. Knowing which attributes are defaults and which actually came from the XML document is also very handy for saving space in the downloaded HTML.

I know that a SAX parser can absolutely handle these changes, since I have a modified version of the AElfred2 parser that currently supports them.

(*Chris*)

----- Original Message -----
From: "Vilya Harvey" <vilya.harvey@o...>
To: "XML Development Interest Group" <xml-dev@l...>
Sent: Wednesday, July 12, 2000 1:55 AM
Subject: Re: SAX2 Lexical Handler Suggestions

> Chris Pratt wrote:
> > First of all, what is the utility of the startEntity() and endEntity()
> > methods of the LexicalHandler? The end of the entity has definitely been
> > parsed before the call to startEntity since the name of the entity is a
> > parameter. And since a single entity can't bracket information (like an
> > element does) there is no utility in the endEntity() method, unless I'm
> > missing something obvious. In this case, I would suggest we rename the
> > startEntity() method to simply entity() and remove the endEntity() method.
>
> startEntity() and endEntity() respectively indicate the start and end of the
> _replacement_text_ of an entity reference. This replacement text may itself
> contain characters which cause other handler methods to be invoked; that's why
> there needs to be start & end methods for it. This could probably be made a
> little clearer in the documentation. I think the confusion mainly springs from
> the common misuse of the term 'entity' to mean 'entity reference', while SAX
> uses the term in its proper sense to mean the block of characters that are
> indicated by the reference.
>
> > Also many systems (mine included) need to be able to tell the difference
> > between an entity in element data and an entity as an attribute value, so
> > I'd suggest adding a boolean parameter to the entity() method specifying
> > which of the two possible uses of entities has been found (i.e. public void
> > entity (String name,boolean isAttr);).
>
> Because of the way SAX has been designed to work, entities in attributes cannot
> be reported. They are just resolved silently by the parser.
>
> Why do you need to know when an entity is being processed? Apologies in advance
> if this treads on your (or anyone else's) toes, but I generally find that this
> "need" to know stems from a misunderstanding of their purpose. In some ways
> they are analogous to preprocessor macros in C: they get expanded and the
> result is processed as if it was part of the original document. The only thing
> that needs to know about preprocessor macros is the compiler (for generating
> debug information); likewise, the only thing that really needs to know about
> entity references (generally speaking) is the parser. They are essentially a
> shorthand mechanism, although they also allow external documents to be
> included.
>
> > My final suggestion is a method to detect when attributes are encountered
> > being that there is a great amount of information that can be disseminated
> > in an attribute definition. My current hack of the AElfred parser defines a
> > SAX2 Extension handler that supports the following call:
> <snip>
>
> All of the information in your attribute() method, with the exception of the
> raw attribute text, is available through the existing interface. As regards the
> raw attribute text, see my comments above. If you really do want to process
> entity reference names (particularly in attributes), I would suggest grabbing
> the source code to a parser (you obviously already have AElfred) and adding
> your own non-SAX API to it, rather than changing SAX to support this.
>
> Hope that helps,
> Vil.
>
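
For readers following the thread, the two behaviours discussed above (startEntity()/endEntity() bracketing the events produced by an entity's replacement text, and entity references in attribute values being expanded without any report) can be observed with a small trace. This is only a minimal sketch, assuming the JAXP parser bundled with the JDK and the standard org.xml.sax.ext.DefaultHandler2 adapter. The class name EntityTrace, the trace() helper, the event strings, and the sample document are all made up for illustration; none of this is the modified AElfred2 code mentioned in the reply.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical demo class; not part of SAX or AElfred.
final class EntityTrace {

    // Made-up sample: "who" is referenced in an attribute value,
    // "sig" is referenced in element content and contains markup.
    static final String SAMPLE =
        "<!DOCTYPE doc [\n"
        + "  <!ENTITY who \"Vil\">\n"
        + "  <!ENTITY sig \"<b>signed</b>\">\n"
        + "]>\n"
        + "<doc by=\"&who;\">&sig;</doc>";

    // Parses the document and returns the element, attribute and entity
    // events in the order the parser reported them.
    static List<String> trace(String xml) {
        final List<String> events = new ArrayList<>();
        // DefaultHandler2 implements both ContentHandler and LexicalHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override public void startEntity(String name) {
                // Skip pseudo-entities such as "[dtd]".
                if (!name.startsWith("[")) events.add("startEntity(" + name + ")");
            }
            @Override public void endEntity(String name) {
                if (!name.startsWith("[")) events.add("endEntity(" + name + ")");
            }
            @Override public void startElement(String uri, String local,
                                               String qName, Attributes atts) {
                events.add("startElement(" + qName + ")");
                for (int i = 0; i < atts.getLength(); i++) {
                    events.add("attribute(" + atts.getQName(i) + "=" + atts.getValue(i) + ")");
                }
            }
            @Override public void endElement(String uri, String local, String qName) {
                events.add("endElement(" + qName + ")");
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // Standard SAX2 property id for registering a LexicalHandler.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : trace(SAMPLE)) {
            System.out.println(event);
        }
    }
}
```

Running main prints the trace. With a parser that follows the LexicalHandler contract, the startElement(b) event should appear between startEntity(sig) and endEntity(sig), while the by attribute arrives already expanded with no entity events for who.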