
Re: String interning (WAS: SAX2/Java: Towards a final form)

  • From: Tyler Baker <tyler@i...>
  • To: Miles Sabin <msabin@c...>
  • Date: Wed, 12 Jan 2000 16:29:18 -0500

Miles Sabin wrote:

> Tyler Baker wrote,
> > Miles Sabin wrote,
> > > [snip: table mapping to intern'd Strings]
> > > Even tho' this only requires one java-intern for each
> > > distinct name it still provides plenty of opportunities for
> > > synchronization collisions.
> >
> > Nope. Names in XML are highly redundant especially for
> > Namespace prefixes. Also, even if the number of calls to
> > String.intern() were significant (which they rarely if ever
> > are), modern Java runtimes have lowered synchronization
> > overhead to be small enough that you don't really have to
> > think about it much in terms of impacting performance
> > anymore.
>
> I think you're making two assumptions that don't always hold.
> Not all java xml applications are one shot, single doctype:
> some continuously parse multiple documents of a variety of
> doctypes in multiple threads. There's not necessarily _any_
> particular upper bound on the number of distinct element and
> attribute names that might be encountered. So there could be
> continual contention for the JVM's intern table.

Well, of course there is never an upper bound on the number of distinct element names or
attribute names in a document, but in general you have far more elements and attributes
than you have distinct element or attribute names. Trying to satisfy a condition that will
never happen in real-world XML usage is exactly the same wrong mode of thinking that I
think led to how "Namespaces in XML" came about. The designers, I feel, tried to satisfy
all of these hypothetical conditions without ever thinking about the real-world
implications. That is what you are doing here, which is laudable, but I don't think it
really has anything to do with real-world use of XML.
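
To make the "one intern per distinct name" point concrete, here is a minimal sketch of the kind of table discussed earlier in the thread. The class name NameTable and its methods are hypothetical, invented for illustration; they are not part of SAX2 or of any particular parser. It assumes one table per parser instance, so the table itself needs no locking and only the first sighting of a name reaches String.intern().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-parser name table (illustrative only, not a SAX2 API).
// Assumption: one instance per parser, used from a single thread.
class NameTable {
    private final Map<String, String> names = new HashMap<>();
    private int internCalls = 0;

    // Returns the canonical, interned String for the name held in buf.
    public String canonical(char[] buf, int offset, int length) {
        String key = new String(buf, offset, length);
        String interned = names.get(key);
        if (interned == null) {
            // First time this parser sees the name: the only point
            // where the JVM-wide intern table is touched.
            interned = key.intern();
            internCalls++;
            names.put(interned, interned);
        }
        return interned;
    }

    // Number of times String.intern() was actually invoked.
    public int internCalls() {
        return internCalls;
    }
}
```

This sketch still allocates a temporary String on every lookup; a real parser would more likely hash the character buffer directly. The point is only that the number of intern() calls tracks the number of distinct names, not the number of elements and attributes.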

> And I think you're assuming a single processor JVM. The
> synchronization overhead picture is *very* different on multi-
> processors.

Synchronization is synchronization. For most documents, making a call to String.intern()
50-100 times in a 100KB document is a lot less expensive than doing:

if (x.equals("foo")) {

}
else if (x.equals("bar")) {

}
etc...

As opposed to:

if (x == "foo") {

}
else if (x == "bar") {

}
etc.

Calling the equals method repeatedly can get expensive in long if/else chains like these.
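
A caveat worth spelling out: the == form is only correct when the string being tested has itself been interned, since string literals are interned by the JVM but strings built at runtime are not. A small self-contained sketch follows; the class and method names are made up for illustration.

```java
// Illustrative only: InternDispatch and dispatch() are hypothetical names.
class InternDispatch {

    // Compares by reference. Correct only if the caller passes an
    // interned String; "foo" and "bar" are interned as literals.
    static String dispatch(String name) {
        if (name == "foo") {
            return "handled foo";
        }
        else if (name == "bar") {
            return "handled bar";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // A distinct String object, as if built from a parse buffer.
        String fromParser = new String("foo");

        System.out.println(dispatch(fromParser));          // prints "unknown"
        System.out.println(dispatch(fromParser.intern())); // prints "handled foo"
    }
}
```

So the identity comparison only pays off if whatever produces the names guarantees they are interned before they are handed to this kind of code.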

Tyler


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ or CD-ROM/ISBN 981-02-3594-1
Please note: New list subscriptions now closed in preparation for transfer to OASIS.


