
  • From: David Megginson <david@m...>
  • To: xml-dev@l...
  • Date: Fri, 22 Sep 2000 14:02:12 -0400 (EDT)

Alex Milowski writes:

 > In the ContentHandler interface, there is a method called characters()
 > which allows the processor to pass the character data that is a child
 > of an element to a processing application.  If you introduce XML Schemas,
 > this allows one to create a streaming type factory to construct the
 > actual type instance without having to first instantiate a Java
 > string--which is very good from an optimization standpoint.

Yes, although Java Strings are much more efficient than they used to
be, at least in the Linux VMs.  I remember running some tests a
couple of years ago when Tim Bray suggested that string allocation was
expensive, and the overhead of allocating thousands of strings turned
out to be negligible.  I think that JDK 1.1 must have fixed some
problems there.

 > Unfortunately, the same concept does not exist for attributes.  An
 > attribute's value has already been constructed into a Java string before
 > the application can receive the lexical representation.  This seems rather
 > unfortunate for XML Schemas and optimization since the typing of "leaf
 > nodes" within an XML document is uniform for attributes and element child
 > content.

This was a matter of much discussion during the original SAX 1.0
design, and most people preferred it this way.
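To make the asymmetry concrete, here is a minimal sketch of a SAX2 handler that types both kinds of leaf as integers. The class name LeafTypingSketch, the attribute name "n", and the totals() helper are hypothetical, chosen only for illustration. Element content arrives as a slice of a char[] and can be converted digit by digit with no intermediate String, while the attribute value is only available as a String from Attributes.getValue().

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: sums integer element content and integer "n" attributes.
public class LeafTypingSketch extends DefaultHandler {
    long elementTotal;    // built straight from the parser's char[] buffer
    long attributeTotal;  // built from Strings the parser has already allocated
    private long current; // value of the element content seen so far

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        current = 0;
        // Attribute path: the lexical value is only available as a String.
        String n = atts.getValue("n");
        if (n != null) {
            attributeTotal += Long.parseLong(n);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Element path: read digits directly from the buffer slice.
        // A parser may split one text node across several calls, so the
        // running value is carried over between calls.
        for (int i = start; i < start + length; i++) {
            char c = ch[i];
            if (c >= '0' && c <= '9') {
                current = current * 10 + (c - '0');
            } else if (!Character.isWhitespace(c)) {
                throw new SAXException("not a digit: " + c);
            }
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        elementTotal += current;
        current = 0;
    }

    // Convenience wrapper (an assumption of this sketch): parse a document from a String.
    static LeafTypingSketch totals(String xml) {
        LeafTypingSketch handler = new LeafTypingSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler;
    }
}
```

The sketch handles only flat, non-negative decimal content; a real schema-aware type factory would need per-type parsing and proper handling of nested elements.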

 > Is it too late to fix this?  This would seriously help in building
 > optimized XML Schema aware processors.

Yes, it's too late to fix this, at least for now -- I intend a bug-fix
release soon, but no major API changes for a while (except extensions,
which are outside the SAX2 core).  I'd be interested in seeing some
profiling data to see how much the string allocation is actually
costing.

Note that a parser (though not a filter, obviously) could perform lazy
allocation of strings -- that might help a bit.
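As a rough illustration of the lazy-allocation idea, a parser could keep each attribute value as a slice of its own input buffer and build the String only when the application first asks for it. The class LazyAttributeValue below is hypothetical and is not part of SAX; it also assumes the parser does not overwrite that region of the buffer before the value is read.

```java
// Hypothetical holder a parser might use internally for one attribute value.
public class LazyAttributeValue {
    private final char[] buffer; // the parser's input buffer (not copied)
    private final int start;     // offset of the value within the buffer
    private final int length;    // number of characters in the value
    private String cached;       // allocated on first request only

    public LazyAttributeValue(char[] buffer, int start, int length) {
        this.buffer = buffer;
        this.start = start;
        this.length = length;
    }

    // Builds the String the first time it is requested, then reuses it.
    public String get() {
        if (cached == null) {
            cached = new String(buffer, start, length);
        }
        return cached;
    }

    // True once a String has actually been allocated for this value.
    public boolean isMaterialized() {
        return cached != null;
    }
}
```

An application that reads only one attribute out of many would then pay for one String rather than one per attribute. A filter cannot do the same, since it receives values that are already Strings.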


All the best,


David

-- 
David Megginson                 david@m...
           http://www.megginson.com/
