
Re: how to skip parsing of a subtree?


At 06:46 -0700 2005-10-10, Anil Philip wrote:
>Hello,
>I am using SAX/Java to parse through a tree of nodes
>looking for a particular subtree. Each node has a
>nodeId attribute (an int).
>When I find the subtree, I want to parse it into
>nodes. However it would be a waste to parse the rest
>of the xml file; how do I skip parsing the rest?
>Parsing the rest of the file would be simply
>unnecessary.
>(I assume SAX is pre-order tree traversal).
>thanks,
>Anil

Skipping the part *after* the subtree you want is easy: just shut 
down the parser.
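
SAX has no explicit "stop" call, so one common way to shut the parser down is to throw a SAXException from the handler once the subtree's end tag arrives. A minimal sketch, assuming the attribute is spelled nodeID and using made-up class names (StopParsing, SubtreeHandler, EarlyStop); it only records element names, where real code would build its nodes:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical marker exception: it means "done on purpose", not a parse error.
class StopParsing extends SAXException {
    StopParsing() { super("subtree complete"); }
}

// Records the element names inside the subtree whose root carries the wanted
// nodeID, then aborts the parse when that subtree's end tag is reached.
class SubtreeHandler extends DefaultHandler {
    private final String wantedId;
    private int depth = 0;                      // 0 means "outside the subtree"
    final List<String> names = new ArrayList<>();

    SubtreeHandler(String wantedId) { this.wantedId = wantedId; }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        // Assumption: the attribute is named exactly "nodeID".
        if (depth > 0 || wantedId.equals(atts.getValue("nodeID"))) {
            depth++;
            names.add(qName);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (depth > 0 && --depth == 0) {
            throw new StopParsing();            // leave the parser here
        }
    }
}

class EarlyStop {
    static List<String> extract(String xml, String id) throws Exception {
        SubtreeHandler handler = new SubtreeHandler(id);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsing expected) {
            // Normal exit: the handler stopped the parse deliberately.
        }
        return handler.names;
    }
}
```

Calling EarlyStop.extract(xml, "2") returns the names collected from the matching subtree. If the ID never appears, the parse simply runs to the end and the list is empty.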

Skipping the part *before* may be hard. If your tree is headed by

    <foo nodeID='65536'>...

*and* if you're certain that nothing like "nodeID=" will show up in 
content, or in a comment or PI, etc., then you can just scan for it.

Once you find it, you have to tell SAX to start at that offset in 
the file. You might try creating your own EntityResolver: when it's 
asked to supply the appropriate entity, open the file and seek to 
the right place before handing it to SAX.
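
Here is a rough sketch of the "seek first, then parse" idea. Note that it swaps in a simpler route than the EntityResolver: it positions a FileInputStream itself and hands that to the parser in an InputSource. It assumes the file is UTF-8 (the XML declaration gets skipped along with everything else, so the encoding is set by hand), that the offset is the byte position of the element's '<', and that the subtree needs no namespace or entity declarations from the skipped part. The names OffsetParse, Done, and textOfElementAt are made up. Whatever follows the subtree is not a well-formed document on its own, so the handler leaves the parser at the subtree's end tag:

```java
import java.io.FileInputStream;
import java.nio.file.Path;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class OffsetParse {
    // Hypothetical marker exception used to leave the parser on purpose.
    static class Done extends SAXException {
        Done() { super("subtree complete"); }
    }

    // Parses the single element starting at byteOffset and returns its text.
    static String textOfElementAt(Path file, long byteOffset) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                depth++;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) throws SAXException {
                // The first element seen is the subtree root; stop when it closes.
                if (--depth == 0) throw new Done();
            }
        };
        try (FileInputStream in = new FileInputStream(file.toFile())) {
            in.getChannel().position(byteOffset);   // seek before SAX sees the stream
            InputSource source = new InputSource(in);
            source.setEncoding("UTF-8");            // assumption: file is UTF-8
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        } catch (Done expected) {
            // Normal exit: the subtree has been read completely.
        }
        return text.toString();
    }
}
```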

Please note that there are quite a few things that could mess up a 
simple scan. For example:

    * The nodeID could show up in a comment, PI, content, another 
attribute, etc.

    * The attribute value could use the other kind of quotes and you'd miss it.

    * If there are entities around, your scanner would have to handle them, too.

But if you can ensure that none of the tricky cases come up, 
something like this should work.
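
For what it's worth, the simple scan might look like the sketch below. It accepts either kind of quote, but it is still fooled by every other case in the list above (a match inside a comment, PI, or content, a longer attribute name ending in "nodeID", whitespace around the "=", entities). NaiveScan and findStartTag are made-up names, and the whole document is taken as a byte array only to keep the example short; a real scanner would read the file through a buffer.

```java
import java.nio.charset.StandardCharsets;

class NaiveScan {
    // Returns the byte offset of the '<' opening the first tag that contains
    // nodeID='<id>' or nodeID="<id>", or -1 if neither form is found.
    static int findStartTag(byte[] doc, String id) {
        int best = -1;
        for (char quote : new char[] {'\'', '"'}) {
            byte[] needle = ("nodeID=" + quote + id + quote)
                .getBytes(StandardCharsets.UTF_8);
            int hit = indexOf(doc, needle);
            if (hit >= 0 && (best < 0 || hit < best)) {
                best = hit;                     // keep the earliest match
            }
        }
        if (best < 0) return -1;
        // Walk back from the attribute to the '<' that opens its tag.
        for (int i = best; i >= 0; i--) {
            if (doc[i] == '<') return i;
        }
        return -1;
    }

    // Plain byte-wise substring search; returns -1 when needle is absent.
    static int indexOf(byte[] hay, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= hay.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (hay[i + j] != needle[j]) continue outer;
            }
            return i;
        }
        return -1;
    }
}
```

The offset it returns is a byte offset, which is what you need for seeking in the file.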

Steve

-- 
Luthien Consulting: Real solutions to hard information management problems
    Specializing in information design, XML, schemas, XSLT, and 
project design/review/repair
Steven J. DeRose, Ph.D., sderose@a...
