
Re: Problem parsing XML file with Xerces-J

  • To: Michael Kay <mike@s...>
  • Subject: Re: Problem parsing XML file with Xerces-J
  • From: Midsummer Sun <midsummer.sun@g...>
  • Date: Fri, 1 Apr 2005 13:05:20 +0530
  • Cc: xml-dev@l...
  • In-reply-to: <bb5e8b8605033120201ef8eb03@m...>
  • References: <bb5e8b86050331053731edca96@m...> <424c03c3.421dc0eb.67bd.ffffb62bSMTPIN_ADDED@m...> <bb5e8b8605033120201ef8eb03@m...>
  • Reply-to: Midsummer Sun <midsummer.sun@g...>

> I think pre-editing the response XML (i.e. stripping the DTD declaration) is
> better "for me". For my requirement, the DTD in the XML is useless to
> me. Implementing EntityResolver imposes significant performance
> overhead on my program. The parser is always polling for callback
> events. So I think pre-editing with a simple string method is far more
> efficient.

I would like to amend my observation above slightly.

My program is doing:
DocumentBuilderFactoryImpl factory = new DocumentBuilderFactoryImpl();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(rsp)));

So I am using a DOM parser! But underneath, a DOM parser is probably
using a SAX handler to build the DOM, i.e. a SAX parser is
dispatching events to the DOM builder as it reads the XML
document, and the DOM implementation constructs the DOM tree by
"assembling input from the SAX implementation". I read this in a nice
article somewhere.
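
To make the "DOM assembled from SAX events" idea concrete, here is a minimal sketch using only the standard JAXP and SAX classes. It is not a claim about what Xerces-J does internally; the class name SaxToDomSketch and the method build are made up for illustration, and the handler ignores namespaces, comments and processing instructions.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxToDomSketch {
    // Hypothetical helper: builds a DOM tree by hand from SAX callbacks.
    static Document build(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        // The top of the stack is the node currently being filled in.
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(doc);

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                Element e = doc.createElement(qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    e.setAttribute(atts.getQName(i), atts.getValue(i));
                }
                stack.peek().appendChild(e);
                stack.push(e);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                stack.pop();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                stack.peek().appendChild(
                        doc.createTextNode(new String(ch, start, length)));
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return doc;
    }
}
```

Each SAX event adds one node to the tree, so the DOM is complete only once the SAX parser has reached the end of the document.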

My class implements the EntityResolver interface and calls
builder.setEntityResolver(obj), i.e. it registers the class object
itself (obj) as the EntityResolver. This is probably just a
lightweight reference within the JVM, and nothing expensive worth
worrying about.

So the DOM parser starts to parse the document. If it encounters a DTD
reference, it will call the resolveEntity method. It will probably call
this method after the full DOM tree is constructed (so that all entity
references can be resolved). The resolveEntity method will be called
only once. So there is no expensive processing going on, as I
thought before ;)
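
One way to check these assumptions is a small test along the following lines. This is only a sketch: it uses the standard JAXP DocumentBuilderFactory rather than the Xerces DocumentBuilderFactoryImpl class, and the class name ResolverSketch, the counter and the example.invalid URL are invented for illustration. The resolver returns an empty InputSource so that the DTD is never fetched. As far as I understand the XML processing model, the external DTD subset is requested when the parser reaches the DOCTYPE declaration, i.e. before the root element, rather than after the tree is built.

```java
import java.io.StringReader;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ResolverSketch {
    // Counts how often the parser asks us to resolve an external entity.
    static final AtomicInteger calls = new AtomicInteger();

    // Hypothetical helper: parses xml while substituting an empty DTD.
    static Document parseIgnoringDtd(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        // EntityResolver has a single method, so a lambda is enough here.
        builder.setEntityResolver((publicId, systemId) -> {
            calls.incrementAndGet();
            // An empty source stands in for the DTD; nothing is downloaded.
            return new InputSource(new StringReader(""));
        });
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String rsp = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE response SYSTEM \"http://example.invalid/response.dtd\">\n"
                + "<response><status>ok</status></response>";
        Document doc = parseIgnoringDtd(rsp);
        System.out.println("root element: " + doc.getDocumentElement().getTagName());
        System.out.println("resolveEntity calls: " + calls.get());
    }
}
```

Printing the counter after a parse shows how many callbacks a given document actually triggers, which is the number that matters for the overhead question.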

Please do correct me if I am wrong.

If the resource consumption of implementing EntityResolver is the same as
that of the pre-editing solution (or there is only a very marginal
difference), I'll prefer implementing the EntityResolver interface! It
could be a USP in my application!
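
For comparison, the pre-editing alternative could look roughly like the sketch below. The class name DoctypeStripSketch and the method stripDoctype are hypothetical, and the regular expression is an assumption that covers only the simple cases: a DOCTYPE with an external identifier and, optionally, a plain internal subset. It would be fooled by a ']' or '>' inside a quoted literal in the internal subset, and by a DOCTYPE-like string inside a comment or CDATA section.

```java
import java.util.regex.Pattern;

public class DoctypeStripSketch {
    // Matches "<!DOCTYPE ... >" with an optional "[ ... ]" internal subset.
    // (?s) lets '.' span line breaks inside the internal subset.
    private static final Pattern DOCTYPE =
            Pattern.compile("(?s)<!DOCTYPE[^>\\[]*(\\[.*?\\])?\\s*>");

    // Hypothetical helper: removes the first DOCTYPE declaration, if any.
    static String stripDoctype(String xml) {
        return DOCTYPE.matcher(xml).replaceFirst("");
    }
}
```

The result of stripDoctype(rsp) can then be passed to builder.parse in place of rsp, at the cost of one extra pass over the string.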

I am eagerly waiting for your opinion.

Best regards,
