
Re: nestable C/C++ XML parser?

  • From: "Thomas B. Passin" <tpassin@i...>
  • To: "xml dev mailing list" <xml-dev@i...>
  • Date: Wed, 8 Dec 1999 08:28:51 -0500

----- Original Message -----
From: Paul Miller <stele@f...>

> I'm trying to develop a tag-based front-end to expat and having no luck.
> I'd like to be able to parse an XML document in nestable chunks, by
> calling into a nestable parser. In other words, I'd like to start
> parsing, then branch to a function to handle a specific element, parsing
> in there until that element is closed, then fall back out of the
> function to continue parsing the rest of the document.
>
I take it that you want to be able to ignore part of the document, and only
process the pieces you are interested in.  Is that right?  Then each piece
would be well-formed XML if it were enclosed in a root element.  You don't
need to literally do what you have suggested, that is, "parse in there...".
You do need to handle the elements of different pieces differently.  Three
approaches come to mind.

1) Preprocess to extract just the pieces you want, wrap them in root
elements so they are complete documents, then run expat (or whatever)
separately on them using SAX. The preprocessing should be fast and easy, and
perhaps could be done using regular expressions, or SAX.  Alternatively, if
the XML is relatively simple, don't wrap the fragments, and process them
using regular expressions instead. (Search the archives of this group for
the last few months to find a reference to "shallow parsing using regular
expressions").

2)  You really are talking about a state machine, I think.  That is, if you
have reached the right piece of the document, you go to a different manner
of handling the elements (they will still parse the same, it's just the
handling that would be different).  So you could explicitly maintain a state
variable and have the SAX (or whatever) callbacks behave differently
according to the state.  This would be conceptually simple but might be a
pain to implement depending on how many different element handlers you will
use.

3) Again as a state machine, you could use a function pointer to specify the
callbacks, and when you change state you change the function pointers to
point to different handlers.  I don't know whether you would have to modify
expat to do this or not, but changes should be minor if needed.
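
For approach 1, here is a rough sketch of the "extract and wrap" step. It uses plain strstr() rather than regular expressions, and it only works under some loud assumptions: the element of interest has no attributes, and it never contains another element of the same name. The function name wrap_fragment and the wrapper element <root> are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Copy the first <tag>...</tag> span found in src into out, wrapped in a
   hypothetical <root> element so it can be parsed as its own document.
   Returns 1 on success, 0 if no span is found or out is too small.
   Assumes no attributes on the tag and no nested element of the same name. */
static int wrap_fragment(const char *src, const char *tag,
                         char *out, size_t out_size)
{
    char open_tag[64], close_tag[64];
    const char *start, *end;
    size_t frag_len;
    int n;

    /* Leave room for "</", ">" and the terminating NUL. */
    if (strlen(tag) > sizeof open_tag - 4)
        return 0;
    sprintf(open_tag, "<%s>", tag);
    sprintf(close_tag, "</%s>", tag);

    start = strstr(src, open_tag);
    if (start == NULL)
        return 0;
    end = strstr(start, close_tag);
    if (end == NULL)
        return 0;
    end += strlen(close_tag);          /* include the closing tag itself */
    frag_len = (size_t)(end - start);

    n = snprintf(out, out_size, "<root>%.*s</root>", (int)frag_len, start);
    return n >= 0 && (size_t)n < out_size;
}
```

The wrapped buffer could then be handed to the parser as a complete document. Anything beyond very regular input (attributes, comments, CDATA sections) would need the shallow-parsing expressions or a real parser instead.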
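
For approach 2, a minimal sketch of the explicit state variable. The two callbacks have the same shape as expat's start and end element handlers (user data pointer, element name, attribute array), but nothing here calls expat itself. The element name "item", the Context struct, and the counters are assumptions chosen only to make the idea concrete.

```c
#include <string.h>

/* Hypothetical states: are we inside the element we care about or not? */
typedef enum { OUTSIDE, IN_ITEM } ParseState;

typedef struct {
    ParseState state;
    int depth;             /* nesting depth inside the current <item> */
    int children_handled;  /* elements processed while inside <item> */
    int skipped;           /* elements ignored everywhere else */
} Context;

/* Start-element callback: one function, behaviour chosen by ctx->state. */
static void on_start(void *user_data, const char *name, const char **atts)
{
    Context *ctx = (Context *)user_data;
    (void)atts;
    switch (ctx->state) {
    case OUTSIDE:
        if (strcmp(name, "item") == 0) {
            ctx->state = IN_ITEM;      /* entering the interesting piece */
            ctx->depth = 1;
        } else {
            ctx->skipped++;            /* not interesting: ignore */
        }
        break;
    case IN_ITEM:
        ctx->depth++;
        ctx->children_handled++;       /* real per-element work goes here */
        break;
    }
}

/* End-element callback: drop back to OUTSIDE when <item> itself closes. */
static void on_end(void *user_data, const char *name)
{
    Context *ctx = (Context *)user_data;
    (void)name;
    if (ctx->state == IN_ITEM && --ctx->depth == 0)
        ctx->state = OUTSIDE;
}
```

Tracking depth rather than comparing names on the way out means a nested element that happens to share the name does not end the state early. With many kinds of interesting elements the switch grows quickly, which is the implementation pain mentioned above.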
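
For approach 3, a sketch of the same state machine with the state held in function pointers instead of a variable. The Parser struct and the fire_start/fire_end helpers are hypothetical stand-ins for whatever drives the callbacks; they are not part of expat. With expat the swap would presumably be done by calling XML_SetElementHandler again from inside a handler, but whether that is permitted mid-parse is an assumption to check against the expat documentation.

```c
#include <string.h>

typedef struct Parser Parser;
typedef void (*StartFn)(Parser *p, const char *name);
typedef void (*EndFn)(Parser *p, const char *name);

/* Hypothetical stand-in for a parser: it only holds the current handlers. */
struct Parser {
    StartFn on_start;
    EndFn on_end;
    int depth;     /* nesting depth inside the current <item> */
    int handled;   /* elements processed by the item handlers */
    int skipped;   /* elements seen by the outer handlers and ignored */
};

static void item_start(Parser *p, const char *name);
static void item_end(Parser *p, const char *name);

/* Outer handlers: ignore everything until an <item> opens. */
static void outer_start(Parser *p, const char *name)
{
    if (strcmp(name, "item") == 0) {
        p->depth = 1;
        p->on_start = item_start;   /* change state by swapping pointers */
        p->on_end = item_end;
    } else {
        p->skipped++;
    }
}

static void outer_end(Parser *p, const char *name)
{
    (void)p;
    (void)name;
}

/* Item handlers: active only between <item> and its matching </item>. */
static void item_start(Parser *p, const char *name)
{
    (void)name;
    p->depth++;
    p->handled++;                   /* real per-element work goes here */
}

static void item_end(Parser *p, const char *name)
{
    (void)name;
    if (--p->depth == 0) {          /* <item> itself has closed */
        p->on_start = outer_start;
        p->on_end = outer_end;
    }
}

/* What the parser would do for each tag: call whatever is installed now. */
static void fire_start(Parser *p, const char *name) { p->on_start(p, name); }
static void fire_end(Parser *p, const char *name)   { p->on_end(p, name); }
```

Each handler pair only knows about its own part of the document, so adding a new kind of element means adding a new pair rather than growing one large switch.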

Regards,

Tom Passin



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
