Fw: ModSAX: Proposed Core Features

  • From: "Oren Ben-Kiki" <oren@c...>
  • To: "XML List" <xml-dev@i...>
  • Date: Thu, 11 Mar 1999 17:48:46 +0200

I asked:
>> Has anything similar [assembling processors based on feature requests]
>> been done in a different field, so we could reuse the
>> design lessons there? It seems like a pretty generic "stream processing"
>> problem.

Ronald Bourret <rbourret@i...> wrote:

>I think there is an inherent assumption in this question that we are
>defining individual features that can be implemented by different parties
>and then randomly assembled to get a useful processor.  While this is
>potentially a useful thing to do -- UNIX pipes are a good example -- it is
>not necessarily an easy thing to do, nor is it clear that this is a goal of
>ExModE-XSAX.

Well, at least the idea warrants some serious thought.

>We tried to do a similar thing in OLE DB, where database functionality
>would be broken down into individual services which could be assembled at
>will on top of a database driver.  (Generally, this would be meaningful
>only for drivers for non-database sources, as drivers for existing
>databases already exposed most/all functionality.)  The idea never really
>worked out, but here are some of the issues:
>
>* Are there enough useful features/components to make this worthwhile?

Good question. For SAX I'd say "probably yes". Here's a list of features
(courtesy of David Megginson):

> http://xml.org/sax/features/validation
>  Validate (true) or don't validate (false).
> http://xml.org/sax/features/external-general-entities
>  Expand external general entities (true) or don't expand (false).
> http://xml.org/sax/features/external-parameter-entities
>  Expand external parameter entities (true) or don't expand (false).
> http://xml.org/sax/features/namespaces
>  Preprocess namespaces (true) or don't preprocess (false).  See also
>  the http://xml.org/sax/properties/namespace-sep property.
> http://xml.org/sax/features/normalize-text
>  Ensure that all consecutive text is returned in a single callback to
>  DocumentHandler.characters or DocumentHandler.ignorableWhitespace
>  (true) or explicitly do not require it (false).

I'd like to see "http://xml.org/sax/features/xsl-transformation" as well.
Anyway, all of the above seem to fall nicely into the pipeline framework.
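
To make "feature requests" concrete, here is a minimal sketch using Python's standard xml.sax module as a stand-in (an assumption on my part; it is not the API under discussion in this thread). The helper name request_feature is hypothetical. It asks a parser for a feature by URI and reports whether the request was accepted, not supported, or not recognized at all. What a given parser answers for each URI depends on that parser.

```python
import xml.sax
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException

# Common prefix of the feature URIs quoted above.
FEATURES = "http://xml.org/sax/features/"


def request_feature(parser, name, state):
    """Ask the parser for a feature and report how it responded.

    request_feature is a hypothetical helper, not part of any SAX API.
    """
    try:
        parser.setFeature(FEATURES + name, state)
    except SAXNotRecognizedException:
        return "not recognized"
    except SAXNotSupportedException:
        return "not supported"
    return "ok"


if __name__ == "__main__":
    parser = xml.sax.make_parser()
    for name in ("validation", "namespaces", "normalize-text"):
        print(name, "->", request_feature(parser, name, True))
```

An application could use the "not supported" or "not recognized" answers to decide which features have to be supplied by a separate filter stage.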

>* What are the interfaces between components and how hard are they to
>implement?

Basically the SAX callbacks, probably extended so that the full document
data is available (comments and so on). This seems pretty much a done deal.

>* How independent are the features?
>* Are there order dependencies between components?

This is a problem, as I've already pointed out. Take "normalize-text", for
example. The effects of such a filter might be lost if it is followed by any
of the entity expansion filters (say), not to mention an XSL one. However,
most of the other features seem relatively independent. I'd say this isn't
a fatal problem. It definitely doesn't affect the API I suggested.
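
The ordering problem can be shown with a small sketch. It uses Python's standard xml.sax module as a stand-in for the SAX callbacks (my assumption, not the API proposed here). NormalizeText merges consecutive characters() events; ExpandPlaceholder is a hypothetical stand-in for an entity expansion filter that replaces an <ent/> element with text. Both class names, and the drive and run helpers, are made up for illustration. The events are fed in by hand, so no parser is involved.

```python
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase


class NormalizeText(XMLFilterBase):
    """Merge consecutive characters() events into a single callback."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self._pending = []

    def _flush(self):
        # Pass any buffered text downstream as one characters() event.
        if self._pending:
            super().characters("".join(self._pending))
            self._pending = []

    def characters(self, content):
        self._pending.append(content)

    def startElement(self, name, attrs):
        self._flush()
        super().startElement(name, attrs)

    def endElement(self, name):
        self._flush()
        super().endElement(name)


class ExpandPlaceholder(XMLFilterBase):
    """Hypothetical stand-in for entity expansion: <ent/> becomes text."""

    def startElement(self, name, attrs):
        if name == "ent":
            super().characters("EXPANDED")
        else:
            super().startElement(name, attrs)

    def endElement(self, name):
        if name != "ent":
            super().endElement(name)


class Recorder(ContentHandler):
    """Final consumer: records each characters() callback it receives."""

    def __init__(self):
        super().__init__()
        self.text_events = []

    def characters(self, content):
        self.text_events.append(content)


def drive(first):
    # Hand-fed events for: <p>a <ent/> b</p>
    first.startElement("p", {})
    first.characters("a ")
    first.startElement("ent", {})
    first.endElement("ent")
    first.characters(" b")
    first.endElement("p")


def run(order):
    """Chain the filter classes in the given order and return the text events."""
    recorder = Recorder()
    stages = [cls() for cls in order]
    for upstream, downstream in zip(stages, stages[1:]):
        upstream.setContentHandler(downstream)
    stages[-1].setContentHandler(recorder)
    drive(stages[0])
    return recorder.text_events


if __name__ == "__main__":
    print(run([NormalizeText, ExpandPlaceholder]))  # ['a ', 'EXPANDED', ' b']
    print(run([ExpandPlaceholder, NormalizeText]))  # ['a EXPANDED b']
```

With normalization first, the expansion stage reintroduces fragmented text; with normalization last, the consumer gets a single callback. The filter interface is the same in both cases; only the order in which the stages are assembled differs.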

>* Are performance penalties too high to separate features into separate
>components?

Unknown; I guess this depends on the feature and the implementation. But
then, allowing one to build a system by combining filters doesn't mean one
has to do so. Even inefficient pipelines are still very useful for ad-hoc
processing, for prototyping systems, and so on. From the list of features
above, I'd say that most won't suffer a serious penalty.

>* Who assembles the components -- the application, the processor, or a
>third party?

What I'm suggesting is that we answer "for now, the application", and
provide a simple, lightweight, low-level API which allows it to do so. More
complex solutions could evolve later on. This seems to be in the SAX spirit.
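
As a rough sketch of what "the application assembles the pipeline" could look like, here is an example in Python's standard xml.sax module (a stand-in I am assuming for illustration, not the API proposed in this thread). UppercaseText and RenameElements are hypothetical filters, and assemble and process are hypothetical helpers. The application simply stacks the filters it wants on top of a parser, in the order it chooses.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase


class UppercaseText(XMLFilterBase):
    """Hypothetical filter: upper-cases character data."""

    def characters(self, content):
        super().characters(content.upper())


class RenameElements(XMLFilterBase):
    """Hypothetical filter: prefixes every element name with 'x-'."""

    def startElement(self, name, attrs):
        super().startElement("x-" + name, attrs)

    def endElement(self, name):
        super().endElement("x-" + name)


class Recorder(ContentHandler):
    """The application's own handler at the end of the pipeline."""

    def __init__(self):
        super().__init__()
        self.names = []
        self.text = []

    def startElement(self, name, attrs):
        self.names.append(name)

    def characters(self, content):
        self.text.append(content)


def assemble(filter_classes):
    """Stack the requested filters on top of a parser, first one innermost."""
    reader = xml.sax.make_parser()
    for cls in filter_classes:
        reader = cls(reader)
    return reader


def process(xml_bytes, filter_classes):
    recorder = Recorder()
    reader = assemble(filter_classes)
    reader.setContentHandler(recorder)
    reader.parse(io.BytesIO(xml_bytes))
    # Join the text, since a parser may deliver it in several chunks.
    return recorder.names, "".join(recorder.text)


if __name__ == "__main__":
    print(process(b"<doc><p>hello</p></doc>", [UppercaseText, RenameElements]))
```

Nothing here decides the order for the application or checks for conflicts between filters; that is the kind of more complex solution that could come later.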

>My personal feeling is that assembling XML processors completely on the fly
>is a pipe (if you will excuse the pun) dream.  The world is simply not
>orthogonal enough to make this possible.  Furthermore, there are too many
>performance gains to be had by tight integration of functionality to ever
>convince people to build things entirely as components with public
>interfaces.


Simon St.Laurent has made a good case for layering XML functionality - see
http://www.simonstl.com/articles/layering/layered.htm. The list of features
above seems to validate his claims.

My feeling is that pipelining is a valid approach. This is because there are
quite a few features which fit this model, and each application needs its
own special subset of them. If this weren't the case, we'd be designing
SAX2.0 with a fixed set of features instead of ModSAX.

Have fun,

    Oren Ben-Kiki


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)

