
Re: String interning (WAS: SAX2/Java: Towards a final form)

  • From: Tyler Baker <tyler@i...>
  • To: Assaf Arkin <arkin@e...>
  • Date: Mon, 17 Jan 2000 18:00:07 -0500

Assaf Arkin wrote:

> Tyler,
> I am aware of how to perform interning. I wrote OpenXML which performs
> interning for SAX and DOM, and I'm a contributing member of XML Apache,
> so I'm also familiar with their mechanism.
> Yet, aside from parsers and DOMs I use SAX in a variety of applications
> that do not perform String interning, nor is there any benefit for them
> to do so. I'm afraid that mandating interning will simply break these
> (and many other) applications.

It won't break SAX 1.0 applications, because interning is not a mandated feature there. For SAX
2.0 implementations, these applications will need to support the SAX 2.0 API anyway. Supporting
interned Strings is mostly trivial regardless of the application, but the benefits at the
application level can be immense if performance is at all a consideration. How much it matters
really depends on the size of your documents. For web browsers, interning or not interning is
no big deal because the documents are not that large. I/O is pretty much always the bottleneck
there rather than the parser, even if the parser is very inefficient.

> Also, both OpenXML and Xerces use their internal interning mechanism
> which is substantially faster than String.intern, especially for dealing
> with DOM and parsing, however, the following will never work in either
> OpenXML or Xerces:
> if ( tagName == "foo" )
> for the simple reason that their interning mechanism and String.intern do
> not share the same table.
> arkin
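
To make the quoted point concrete, here is a minimal sketch of why the identity test fails when
a parser keeps its own table. PrivateInternTable is a hypothetical stand-in written for this
illustration; it is not the actual OpenXML or Xerces mechanism.

```java
import java.util.HashMap;
import java.util.Map;

final class InternTableDemo {

    // Hypothetical stand-in for a parser's private intern table.
    // It returns one canonical String per distinct value, but it knows
    // nothing about the JVM's own pool used by String.intern().
    static final class PrivateInternTable {
        private final Map<String, String> table = new HashMap<>();

        String intern(String s) {
            String canonical = table.get(s);
            if (canonical == null) {
                table.put(s, s);
                canonical = s;
            }
            return canonical;
        }
    }

    public static void main(String[] args) {
        PrivateInternTable table = new PrivateInternTable();
        char[] buffer = {'f', 'o', 'o'};

        // Simulate a parser building the same tag name twice from its buffer.
        String first = table.intern(new String(buffer));
        String second = table.intern(new String(buffer));

        // Within the private table, identity comparison works.
        System.out.println(first == second);          // true

        // Against a literal (which lives in the JVM pool) it does not.
        System.out.println(first == "foo");           // false

        // Only a String that went through String.intern() matches the literal.
        System.out.println(first.intern() == "foo");  // true
    }
}
```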

The entire point of using String.intern() is to make the application which uses the parser
framework faster, without forcing you to write code like this:

public static final String CONSTANT = GlobalStringInternTable.intern("foo");

As a developer I prefer to use as few proprietary hooks as I possibly can. I think using some
GlobalStringInternTable would only make sense for namespace support, and only if you had a
parser framework that presented the application with a Name object instead of three strings
consisting of the prefix, namespace, and local part.

I guess it is mostly an argument about what you want the application developer to deal with.
I prefer the way that gives me maximum performance without any obscure coding against some
proprietary string table interface.
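
As a rough sketch of what I mean on the application side, assuming the parser passes every
name through String.intern() before handing it over: the constants are ordinary literals and
the dispatch is a plain identity test. ElementDispatch, its constants, and classify() are
made-up names for illustration only.

```java
final class ElementDispatch {

    // Plain literals; the JVM places these in the same pool String.intern() uses.
    static final String ITEM = "item";
    static final String PRICE = "price";

    // Assumption: localName has already been passed through String.intern()
    // by the parser. If it has not, these identity tests will silently miss.
    static int classify(String localName) {
        if (localName == ITEM) {
            return 1;
        }
        if (localName == PRICE) {
            return 2;
        }
        return 0; // unknown element
    }

    public static void main(String[] args) {
        // Simulate names built from the parser's character buffer, then interned.
        String item = new String(new char[] {'i', 't', 'e', 'm'}).intern();
        String price = new String(new char[] {'p', 'r', 'i', 'c', 'e'}).intern();
        String other = new String(new char[] {'q', 't', 'y'}).intern();

        System.out.println(classify(item));   // 1
        System.out.println(classify(price));  // 2
        System.out.println(classify(other));  // 0
    }
}
```

The trade-off is the one stated in the comment: the identity test is only safe if every parser
you plug in really does intern, which is why it matters whether the API mandates it.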



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ or CD-ROM/ISBN 981-02-3594-1
Please note: New list subscriptions now closed in preparation for transfer to OASIS.
