
Character Tugging (Was:Re: SAX2/Java: Towards a final form)

  • From: "Clark C. Evans" <clark.evans@m...>
  • To: David Megginson <david@m...>
  • Date: Mon, 17 Jan 2000 23:24:26 -0500 (EST)

On Mon, 10 Jan 2000, David Megginson wrote:
>   public interface ContentHandler
>   {
>     public void characters (char ch[], int start, int length)
>       throws SAXException;
> 
>     public void ignorableWhitespace (char ch[], int start, int length)
>       throws SAXException;
>   
>   }


Context:
   
  1. You have a multi-stage process, where SAX is the
     interface between each stage.
  
  2. Much of your XML document consists of Date, Currency,
     and similar object types, which can take significant
     time to parse and whose text form can always be
     retrieved with Object.toString().

Problems:

 1. Between each stage of my process I have to serialize
    immutable application-specific objects (Date, Currency,
    TimeInterval) to a character stream and then reconstitute
    the characters into the same application objects.  This
    seems like a waste of memory and processing time.

 2. Let's say that my character content is bigger than the 
    fixed buffer that the parser is using.  This means that
    multiple calls to characters() will be generated for
    the content, correct?   The general case, then, forces
    the use of a StringBuffer on the receiving end if
    the goal is to build the content into a single String.
    Am I correct here?  I'm not sure.

 3. Much of my XML information is sparse, so a great many
    of the characters() calls will be wasted.  Question:
    is memory allocated for the char[] passed?  If so, isn't
    that allocation unnecessary?
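
To illustrate problem 2, here is a minimal sketch of the receiving end when the goal is a single String. It assumes the parser is free to deliver one run of text in several characters() calls, so every chunk is appended to a StringBuffer. TextCollector and its collect() method are hypothetical names used only for this example, and the JAXP calls are just one way to drive a SAX2 parser.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: gathers all character data into one String.
public class TextCollector extends DefaultHandler {
    private final StringBuffer text = new StringBuffer();

    // May be called several times for one run of text, so each
    // chunk is appended rather than treated as the whole content.
    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    public String getText() {
        return text.toString();
    }

    // Convenience method (an assumption of this sketch): parse a
    // document held in a String and return its character data.
    public static String collect(String xml) throws Exception {
        TextCollector handler = new TextCollector();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getText();
    }
}
```

Note that only the slice ch[start .. start+length) is copied; the rest of the array belongs to the parser.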

Suggested Solution:

   public interface CharTug {
       boolean isCharacters();
       boolean isWhitespace();
       Reader  getReader();
       Object  getObject();
   }

   public interface ContentHandler {
       public void characters(CharTug content);
   }


Problems Revisited:

  1.  Each stage can pass a CharTug, so that
      the Object can be pulled without conversion.

      If the Handler wants a Reader, then
      new StringReader(myObject.toString())
      can be returned.  A helper can automate this.
  
  2.  Two items to note.  

      First, if the content is really huge, then 
      a full blown Reader is ideal.

      Second, if the Handler wants a string, then
      getObject().toString() will do the trick nicely.
      No need to construct a StringBuffer.

  3.  If the content is not needed, then the 
      CharTug will be ignored, and the execution
      will return to the emitter, which can
      then *skip* the content that could have
      been pulled.
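
A rough sketch of the helper mentioned under point 1, assuming the CharTug interface exactly as proposed above. ObjectCharTug and readAll are hypothetical names, and the return values chosen for isCharacters() and isWhitespace() are guesses at the intended meaning. A BigDecimal stands in for an application type such as Currency.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class CharTugSketch {
    // The interface as proposed in this message.
    public interface CharTug {
        boolean isCharacters();
        boolean isWhitespace();
        Reader getReader();
        Object getObject();
    }

    // Hypothetical helper: wraps an application object so the next
    // stage can pull either the object itself or its characters.
    public static final class ObjectCharTug implements CharTug {
        private final Object value;

        public ObjectCharTug(Object value) {
            this.value = value;
        }

        // Assumption: a wrapped object counts as ordinary content.
        public boolean isCharacters() { return true; }
        public boolean isWhitespace() { return false; }

        // The text form is built only if a Reader is requested.
        public Reader getReader() {
            return new StringReader(value.toString());
        }

        // No conversion: the same instance is handed on.
        public Object getObject() { return value; }
    }

    // Utility for the example: drain a Reader into a String.
    public static String readAll(Reader reader) throws IOException {
        StringBuffer out = new StringBuffer();
        char[] buf = new char[256];
        int n;
        while ((n = reader.read(buf)) != -1) {
            out.append(buf, 0, n);
        }
        return out.toString();
    }
}
```

A handler that ignores the CharTug never triggers toString() or any Reader allocation, which is the point of item 3.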


Just trying to solve my practical problems...


Clark


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ or CD-ROM/ISBN 981-02-3594-1
Please note: New list subscriptions now closed in preparation for transfer to OASIS.


