
Re: YAXPAPI (Yet Another XML Parser API)- an XDEV proposal

  • From: David Megginson <ak117@f...>
  • To: Tim Bray <tbray@t...>
  • Date: Sat, 13 Dec 1997 21:01:04 -0500

Tim Bray writes:

 > >  attribute(XmlParser, String, String, boolean) 
 > 
 > It seems completely wrong to have an attribute event separate from
 > start-element events.

I have worried about this myself.  My design goal with Ælfred has been
to limit myself to two class files: one for the parser itself, and one
for the interface for the callbacks -- hence the separate event for
attributes.  This decision has forced some pretty severely hacked-up
internal code accompanied by very careful documentation.

I could send a hashtable of attribute names and values with the
startElement() callback, and let users look up types (etc.) with my
query methods, but I would have to lose a bit on two counts:

1) Allocating a new hashtable for every start tag will slow down the
   parser a fair bit.

2) I'd have no way to show which attributes were specified and which
   were defaulted (see below).

 > What's the boolean?  I don't think the application author should
 > have to deal with anything but the name and value of attributes.

The boolean tells whether the attribute was specified or defaulted.  I
include this to allow people to do useful XML-to-XML transformations.
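
As a rough illustration of why the flag matters for XML-to-XML work, here is a minimal sketch of a handler that rewrites start tags and leaves out defaulted attributes. The interface and class names are hypothetical, not Ælfred's actual API, and the sketch assumes that attribute events arrive before the startElement event for the same element. It does not escape attribute values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface; names are illustrative only.
interface AttributeEvents {
    void attribute(String name, String value, boolean isSpecified);
    void startElement(String name);
}

// Re-serializes start tags, keeping only attributes specified in the source.
class StartTagWriter implements AttributeEvents {
    private final List<String> pending = new ArrayList<>();
    private final StringBuilder out = new StringBuilder();

    @Override
    public void attribute(String name, String value, boolean isSpecified) {
        // Defaulted attributes are dropped so the output mirrors the input.
        if (isSpecified) {
            pending.add(name + "=\"" + value + "\"");
        }
    }

    @Override
    public void startElement(String name) {
        // Assumption: all attribute events for this element came first.
        out.append('<').append(name);
        for (String attr : pending) {
            out.append(' ').append(attr);
        }
        out.append('>');
        pending.clear();
    }

    String result() {
        return out.toString();
    }
}
```

Without the boolean, the handler could not tell a defaulted attribute from a specified one, and every default from the DTD would be written into the output.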

 > >  data(XmlParser, String) 
 > 
 > I feel that the 2nd argument should not be a String.  It is a recipe
 > for disastrous inefficiency if the processor has to cook up a 
 > java.lang.String object for every little chunk of text.  

The overhead isn't that bad with Ælfred because I coalesce my data
into the largest chunks possible before allocating the String.  I
think that returning a char[] array would be confusing for users, and
would lead to many bugs in their code as they ignored our warnings not
to rely on the value in the char[] array outlasting the callback.
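
The kind of bug I have in mind can be sketched as follows. This is a toy simulation, not code from Ælfred or Lark: it assumes a parser that reuses a single char[] buffer for successive text callbacks, and the class name is made up for the example.

```java
class BufferReuseDemo {
    // Simulates two text callbacks that share one parser-owned buffer.
    // Returns what the application sees afterwards for the first chunk:
    // index 0 is the kept buffer reference, index 1 is the copied String.
    static String[] run() {
        char[] buffer = new char[16];

        // First callback delivers "first" in the shared buffer.
        "first".getChars(0, 5, buffer, 0);
        char[] keptReference = buffer;               // bug: holds the parser's buffer
        String keptCopy = new String(buffer, 0, 5);  // safe: copied during the callback

        // Second callback overwrites the same buffer.
        "other".getChars(0, 5, buffer, 0);

        return new String[] { new String(keptReference, 0, 5), keptCopy };
    }
}
```

The kept reference now reads "other" while the copy still reads "first", which is exactly the mistake a String argument rules out.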

 > Lark uses two
 > arguments, a char[] array and a character count; the app can
 > make a String if it needs to.  If you find this awkward, create
 > a new data type called Text so that if you need a String you
 > can make it with lazy-evaluation in Text.toString(), but if you
 > don't need it you don't build it.

Again, I'm reluctant to create new classes beyond XmlParser and
XmlProcessor.
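
For reference, the lazy Text type Tim describes might look roughly like the sketch below. The class and its methods are hypothetical; neither Lark nor Ælfred is claimed to define it this way.

```java
// Hypothetical wrapper over a slice of a character buffer.
final class Text {
    private final char[] chars;
    private final int offset;
    private final int length;
    private String cached; // built on first toString() call only

    Text(char[] chars, int offset, int length) {
        this.chars = chars;
        this.offset = offset;
        this.length = length;
    }

    int length() {
        return length;
    }

    char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return chars[offset + index];
    }

    // True once a String has actually been allocated.
    boolean isMaterialized() {
        return cached != null;
    }

    @Override
    public String toString() {
        if (cached == null) {
            cached = new String(chars, offset, length);
        }
        return cached;
    }
}
```

Note that a Text wrapping the parser's own buffer carries the same lifetime caveat as a raw char[]: it is only safe to keep after toString() has been called, or if the parser hands over a private copy.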

 > Also, it shouldn't be named "data" - it should be named
 > characterData or charData or text or some such term that can
 > be mapped directly to the spec.

Agreed.  I will not change Ælfred now, but I think that this is a good
idea.

 > >  resolveEntity(XmlParser, String, String, URL) 
 > 
 > I don't think entities have any place in the first cut of this 
 > interface.  The processor exists to make these problems go away.

Normally, you should just return the URL argument; however, this
callback gives users a chance to do public-identifier resolution, URL
substitution, etc., and to return a different URL if desired.  For
example, if we had a DTD at

  http://www.microstar.com/XML/msldoc.dtd

and you had a local copy, you could substitute a local URL on your own
computer.  Likewise, you could do a catalogue lookup on the public
identifier "-//microstar//DTD Microstar Sample Document//EN" and
choose a different system identifier than the default supplied in the
document.
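
A minimal sketch of both substitutions, under some loud assumptions: the class and method names are invented for the example, the callback is reduced to a public identifier plus the default URL, and the local file paths a user would register are hypothetical. Lookups are keyed on strings rather than URL objects to avoid the host lookups that URL.equals() and URL.hashCode() can trigger.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.HashMap;
import java.util.Map;

// Hypothetical catalogue; a stand-in for what a resolveEntity callback might do.
class LocalCatalogue {
    private final Map<String, URL> byPublicId = new HashMap<>();
    private final Map<String, URL> bySystemId = new HashMap<>();

    void mapPublicId(String publicId, URL local) {
        byPublicId.put(publicId, local);
    }

    void mapSystemId(String systemId, URL local) {
        bySystemId.put(systemId, local);
    }

    // Returns a substitute URL if one is registered, else the default unchanged.
    URL resolveEntity(String publicId, URL defaultUrl) {
        if (publicId != null) {
            URL viaPublicId = byPublicId.get(publicId);
            if (viaPublicId != null) {
                return viaPublicId;
            }
        }
        URL viaSystemId = bySystemId.get(defaultUrl.toString());
        return viaSystemId != null ? viaSystemId : defaultUrl;
    }

    // Convenience helper so callers need not handle the checked exception.
    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

A user with a local copy would register it once, for example with mapSystemId("http://www.microstar.com/XML/msldoc.dtd", LocalCatalogue.url("file:///local/dtd/msldoc.dtd")), where the file path is only a placeholder.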

That said, I agree that this probably doesn't belong in the common
event API.

 > Generalities: 
 > Lark has a thing where if any callback returns 'true', the
 > parser drops out of its loop... which is awfully useful and easy
 > I think.  Lark will also re-enter, but this need not be a requirement.

Awfully easy with a DFA-driven parser, but trickier with a
recursive-descent parser like Ælfred.  I'd probably have to throw an
exception, and could not allow any kind of re-entry.
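
To make the difficulty concrete, here is a toy recursive-descent sketch, not Ælfred's code. All names are hypothetical, and the "document" is just an array of tokens in which any name opens an element and "/" closes it. When a callback returns true, the only way out of the nested calls is to throw.

```java
// Hypothetical exception used only to unwind the parser's call stack.
class StopParsing extends RuntimeException {
}

// Hypothetical callback: return true to ask the parser to stop.
interface Handler {
    boolean startElement(String name);
}

class TinyParser {
    private final String[] tokens;
    private final Handler handler;
    private int pos;

    TinyParser(String[] tokens, Handler handler) {
        this.tokens = tokens;
        this.handler = handler;
    }

    // Returns true if all input was parsed, false if a callback stopped it.
    boolean parse() {
        pos = 0;
        try {
            while (pos < tokens.length) {
                parseElement();
            }
            return true;
        } catch (StopParsing stop) {
            return false;
        }
    }

    // One recursive call per nesting level; parse state lives on the call stack.
    private void parseElement() {
        String name = tokens[pos++];
        if (handler.startElement(name)) {
            throw new StopParsing();
        }
        while (pos < tokens.length && !tokens[pos].equals("/")) {
            parseElement();
        }
        pos++; // consume the closing "/"
    }
}
```

Once the exception has unwound the stack, the record of which elements were open is gone, which is why re-entry is not on offer here. A DFA-driven parser keeps that state in explicit fields and can simply return from its loop.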

 > Also, for application programmers, especially dealing with smallish
 > objects, a tree interface is very natural.  I've written both
 > event-stream and tree apps using Lark, and the trees are a lot
 > easier to use for anything even moderately complex.  So the API 
 > should have Element, Attribute, and Text classes. 

Perhaps -- I may have to give in and allow Ælfred to use more than one
class file; or alternatively, these would be an optional extra, along
with the SAX-J layer.

 > And it shouldn't (sorry Peter) be called YAXPAPI - how about SAX, Simple
 > API for XML?  Maybe SAX-J for the Java bindings. -Tim

How about RUSTY?


All the best,


David

-- 
David Megginson                 ak117@f...
Microstar Software Ltd.         dmeggins@m...
      http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)

