
Re: String interning (WAS: SAX2/Java: Towards a final form)

  • From: David Brownell <david-b@p...>
  • To: xml-dev@i...
  • Date: Sat, 15 Jan 2000 19:06:19 -0800

Tim Bray wrote:
> At 12:13 PM 1/14/00 -0800, David Brownell wrote:
> >The reason not to mandate it is that there are non-parser
> >applications of SAX, where it's unreasonable to demand that
> >the event source guarantee such interning.
> Really?  Could you expand on that?

I was thinking of examples ... keep in mind that the consumer
of these events could be lots of things:  something that writes
out XML text over a socket, something building a DOM, something
that does XSLT transforms, etc.

FIRST example, a type that I think will be common, is one where
it's actually easy to demand that the interning happen.  Namely,
objects that know how to print themselves as XML, likely as an element.

That element structure won't be dynamically determined, at least
in my playbook (Keep It Simple, Stupe!) so that names of elements
and attributes, and namespace URIs, will most naturally be string
constants and hence automagically interned.
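A minimal sketch of that first case, assuming a hypothetical Point class
(the class, its element name and its attribute names are invented for
illustration).  Because the names are compile-time string constants, the
Java language guarantees they are interned, so the object gets interning
without doing anything extra:

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical object that knows how to emit itself as one XML element.
final class Point {
    // String constants: interned by the JVM, no explicit intern() call.
    private static final String NS = "";
    private static final String ELEMENT = "point";
    private static final String ATTR_X = "x";
    private static final String ATTR_Y = "y";

    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Emits <point x="..." y="..."/> as SAX2 events.
    void emit(ContentHandler out) throws SAXException {
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute(NS, ATTR_X, ATTR_X, "CDATA", Integer.toString(x));
        atts.addAttribute(NS, ATTR_Y, ATTR_Y, "CDATA", Integer.toString(y));
        out.startElement(NS, ELEMENT, ELEMENT, atts);
        out.endElement(NS, ELEMENT, ELEMENT);
    }
}
```

A consumer that holds its own "point" literal could then compare names
with == instead of equals().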

Some people may use different playbooks, focussed on their favorite
generic framework for object<-->XML conversions, that may work the
other way around and find interning to be extra work.

SECOND example could reasonably be viewed as a kind of parser:
it's something that walks all or part of a DOM tree.  With such
trees there's no guarantee that interning is done on names or URIs.

Trees built by hand will _sometimes_ use literals (interned), trees
built using SAX parsers will often have interned strings (viz. many
previous discussions), but there are also ones built using other tools,
say databases with element names, for which the names/URIs wouldn't
normally be interned.  I've seen all three ways to build DOMs.
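A rough sketch of such a tree walker, assuming a hypothetical DomWalker
class that ignores namespaces and reports only elements, attributes and
text.  It passes along whatever String objects the tree builder stored,
so whether the names are interned depends entirely on how the DOM was
built:

```java
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper: replays a DOM subtree as SAX2 events.
final class DomWalker {
    static void walk(Element e, ContentHandler out) throws SAXException {
        // No intern() here: the name is passed through as the DOM holds it.
        String name = e.getNodeName();

        AttributesImpl atts = new AttributesImpl();
        NamedNodeMap map = e.getAttributes();
        for (int i = 0; i < map.getLength(); i++) {
            Node a = map.item(i);
            atts.addAttribute("", a.getNodeName(), a.getNodeName(),
                              "CDATA", a.getNodeValue());
        }

        out.startElement("", name, name, atts);
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                walk((Element) c, out);
            } else if (c.getNodeType() == Node.TEXT_NODE) {
                char[] text = c.getNodeValue().toCharArray();
                out.characters(text, 0, text.length);
            }
        }
        out.endElement("", name, name);
    }
}
```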

THIRD example is similar to a combination of the previous two:
someone uses a custom data structure, lighter weight than DOM but
general enough to handle all their data, and then uses that data
to regenerate a stream of SAX events (sent to socket, etc).

This one's intentionally a bit hand-wavey, since the goal is to
optimize for some problem to which XML (and SAX, and DOM) are very
much incidental.  Since the structures are task-optimized, it's
not certain that interning will be desirable.

Now in all of those cases one could define a postprocessor that
interns all the strings going through startElement(),
processingInstruction(), and so on, but that can be a lot of extra
work that may not be needed.  And extra work is always undesired.
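For concreteness, such a postprocessor might look roughly like the
following.  This is only a sketch: InterningFilter is a hypothetical
name, it is built on the XMLFilterImpl helper class from the SAX2
helpers package, and it covers only the callbacks that carry names or
namespace URIs.  Note that it copies every attribute list, which is
part of the extra work in question:

```java
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: interns names and URIs before passing events on.
final class InterningFilter extends XMLFilterImpl {
    private static String intern(String s) {
        return s == null ? null : s.intern();
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        super.startPrefixMapping(intern(prefix), intern(uri));
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        // Attribute names need interning too, so the list must be copied.
        AttributesImpl copy = new AttributesImpl();
        for (int i = 0; i < atts.getLength(); i++) {
            copy.addAttribute(intern(atts.getURI(i)),
                              intern(atts.getLocalName(i)),
                              intern(atts.getQName(i)),
                              atts.getType(i),
                              atts.getValue(i));
        }
        super.startElement(intern(uri), intern(localName), intern(qName), copy);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        super.endElement(intern(uri), intern(localName), intern(qName));
    }

    @Override
    public void processingInstruction(String target, String data)
            throws SAXException {
        super.processingInstruction(intern(target), data);
    }
}
```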

Ergo my feeling that it's better to just expose whether the event
producer is doing the interning, than to require it always be done.
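
One way a consumer could use such a flag, sketched under assumptions:
released versions of SAX2 define a standard feature named
http://xml.org/sax/features/string-interning, and the code below
queries it through XMLReader.getFeature().  The InterningCheck class
and its method names are hypothetical, and treating an unrecognized
feature as "not interned" is a choice made for this sketch:

```java
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

// Hypothetical helper for consumers that want fast name comparison.
final class InterningCheck {
    static final String STRING_INTERNING =
            "http://xml.org/sax/features/string-interning";

    // True only if the event producer says it interns names and URIs.
    static boolean producerInterns(XMLReader reader) {
        try {
            return reader.getFeature(STRING_INTERNING);
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return false; // assumption: no answer means no guarantee
        }
    }

    // Identity comparison when interning is guaranteed, equals() otherwise.
    static boolean sameName(String a, String b, boolean interned) {
        return interned ? a == b : a.equals(b);
    }
}
```

The identity comparison is only safe if the consumer's own string is
interned as well, for example a literal.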


> Mind you, this debate on what is really a fairly minor piece of SAX
> is probably approaching a negative cost-benefit ratio.

You noticed too?  :-)

- Dave

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ or CD-ROM/ISBN 981-02-3594-1
Please note: New list subscriptions now closed in preparation for transfer to OASIS.

