Re: String interning (WAS: SAX2/Java: Towards a final form)
Tim Bray wrote:
>
> At 12:13 PM 1/14/00 -0800, David Brownell wrote:
> >The reason not to mandate it is that there are non-parser
> >applications of SAX, where it's unreasonable to demand that
> >the event source guarantee such interning.
>
> Really?  Could you expand on that?

I was thinking of examples ... keep in mind that the consumer of these events could be lots of things: something that writes out XML text over a socket, something building a DOM, something that does XSLT transforms, etc.

FIRST example, a type that I think will be common, is one where it's actually easy to demand that the interning happen. Namely, objects that know how to print themselves as XML, likely as an element. That element structure won't be dynamically determined, at least in my playbook (Keep It Simple, Stupe!), so the names of elements and attributes, and the namespace URIs, will most naturally be string constants and hence automagically interned. Some people may use different playbooks, focussed on their favorite generic framework for object<-->XML conversions, that work the other way around and find interning to be extra work.

SECOND example could reasonably be viewed as a kind of parser: it's something that walks all or part of a DOM tree. With such trees there's no guarantee that interning is done on names or URIs. Trees built by hand will _sometimes_ use literals (interned), trees built using SAX parsers will often have interned strings (viz. many previous discussions), but there are also ones built using other tools, say databases holding element names, for which the names/URIs wouldn't normally be interned. I've seen all three ways to build DOMs.

THIRD example is similar to a combination of the previous two: someone uses a custom data structure, lighter weight than DOM but general enough to handle all their data, and then uses that data to regenerate a stream of SAX events (sent to a socket, etc). This one's intentionally a bit hand-wavey, since the goal is to optimize for some problem to which XML (and SAX, and DOM) are very much incidental. Since the structures are task-optimized, it's not certain that interning will be desirable.

Now in all of those cases one could define a postprocessor that interns all the strings going through startElement()/PI()/... and so on, but that can be a lot of extra work that may not be needed. And extra work is always undesired.

Ergo my feeling that it's better to just expose whether the event producer is doing the interning, than to require that it always be done. (phew!)

> Mind you, this debate on what is really a fairly minor piece of SAX
> is probably approaching a negative cost-benefit ratio.

You noticed too? :-)

- Dave

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ or CD-ROM/ISBN 981-02-3594-1
Please note: New list subscriptions now closed in preparation for transfer to OASIS.
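For concreteness, the interning postprocessor described in the message might look roughly like the sketch below. It is only an illustration under stated assumptions: it is written against the SAX2 API as later released (XMLFilterImpl, AttributesImpl, and the "http://xml.org/sax/features/string-interning" feature name), and the class names InterningFilter, IdentityCheckingHandler and InterningDemo are hypothetical, not part of SAX or of anything proposed in this thread.

```java
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical postprocessor: interns names and namespace URIs on the way through.
class InterningFilter extends XMLFilterImpl {
    // Feature name as defined by the released SAX2 API (an assumption relative to this thread).
    static final String INTERNING = "http://xml.org/sax/features/string-interning";

    private static String intern(String s) {
        return s == null ? null : s.intern();
    }

    // Expose that this event producer interns, rather than leaving consumers to guess.
    @Override
    public boolean getFeature(String name)
            throws SAXNotRecognizedException, SAXNotSupportedException {
        return INTERNING.equals(name) || super.getFeature(name);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // Copy the attributes so their names and URIs can be interned too (the extra work).
        AttributesImpl copy = new AttributesImpl();
        for (int i = 0; i < atts.getLength(); i++) {
            copy.addAttribute(intern(atts.getURI(i)), intern(atts.getLocalName(i)),
                    intern(atts.getQName(i)), atts.getType(i), atts.getValue(i));
        }
        super.startElement(intern(uri), intern(localName), intern(qName), copy);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        super.endElement(intern(uri), intern(localName), intern(qName));
    }

    @Override
    public void processingInstruction(String target, String data) throws SAXException {
        super.processingInstruction(intern(target), data);
    }
}

// Hypothetical consumer that relies on interning by comparing names with ==.
class IdentityCheckingHandler extends DefaultHandler {
    boolean matched = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (localName == "title") {   // identity test: only safe when names are interned
            matched = true;
        }
    }
}

class InterningDemo {
    // Returns { match without filter, match with filter, filter reports interning }.
    static boolean[] run() {
        String name = new String("title");   // deliberately NOT interned, e.g. read from a database
        IdentityCheckingHandler direct = new IdentityCheckingHandler();
        direct.startElement("", name, name, new AttributesImpl());

        IdentityCheckingHandler filtered = new IdentityCheckingHandler();
        InterningFilter filter = new InterningFilter();
        filter.setContentHandler(filtered);
        try {
            filter.startElement("", name, name, new AttributesImpl());
            boolean reported = filter.getFeature(InterningFilter.INTERNING);
            return new boolean[] { direct.matched, filtered.matched, reported };
        } catch (SAXException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("without filter: " + r[0] + ", with filter: " + r[1]
                + ", feature reported: " + r[2]);
    }
}
```

Running InterningDemo should show the identity comparison failing for the uninterned name and succeeding once the filter sits in between. The attribute copy made on every startElement() is the kind of extra work the message is wary of, which is why a consumer might prefer to check the feature and fall back to equals() when interning is not reported.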