XML-DEV Mailing List Archive

Re: XML too hard for programmers?

> if you want minimal memory overhead (and not just create DOM and
> navigate it) you can record XML context of one position in file (that
> would include i-scope namespace declarations, stack of start tags,
> attributes etc.)  and use it to move back parser and then restart
> parsing from this position though i have not seen parser that can do
> this ...

My parser does that.  For example, when I parse an ebook, I lay it out
a page at a time, and mark the position in the XML of the content that
starts each page.  I then write an index file containing all the marks
for each page.

Once a document has been indexed, it's very quick to, say, open the
document and jump to the 200th page, or to jump back quickly page by
page, without storing all the XML for each page.

The drawbacks are that you have to reindex everything a) if the
document changes, or b) if any of the display attributes (text size,
line spacing, etc.) change.

All I record for a mark is the offset in the file, the read depth and
the tags of each level of nesting.  I don't know anything about
i-scope namespace declarations (I said I was hopelessly naive!)
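
As a rough illustration of the mark described above, here is a minimal
Java sketch. It is not the actual code from my parser: the class names
Mark and PageIndex, the method names, and the one-line-per-page index
format are all assumptions made up for this example. It only shows
that an offset, a depth and the list of open tags are enough to write
a page index and read it back.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical mark: file offset plus the tags open at that point. */
final class Mark {
    final long offset;        // position in the XML file where the page starts
    final List<String> tags;  // open start tags, outermost first

    Mark(long offset, List<String> tags) {
        this.offset = offset;
        this.tags = new ArrayList<>(tags); // defensive copy
    }

    /** The read depth is simply the number of open tags. */
    int depth() {
        return tags.size();
    }

    /** Serialises to one index-file line: offset, depth, then the tag names. */
    String toIndexLine() {
        StringBuilder sb = new StringBuilder();
        sb.append(offset).append(' ').append(depth());
        for (String tag : tags) {
            sb.append(' ').append(tag); // XML names cannot contain spaces
        }
        return sb.toString();
    }

    /** Parses a line previously written by toIndexLine(). */
    static Mark fromIndexLine(String line) {
        String[] fields = line.trim().split(" ");
        long offset = Long.parseLong(fields[0]);
        int depth = Integer.parseInt(fields[1]);
        List<String> tags = new ArrayList<>();
        for (int i = 0; i < depth; i++) {
            tags.add(fields[2 + i]);
        }
        return new Mark(offset, tags);
    }
}

/** Hypothetical page index: entry i holds the mark for page i + 1. */
final class PageIndex {
    private final List<Mark> marks = new ArrayList<>();

    void addPage(Mark mark) {
        marks.add(mark);
    }

    /** Returns the mark at which the given 1-based page starts. */
    Mark markForPage(int page) {
        return marks.get(page - 1);
    }

    int pageCount() {
        return marks.size();
    }
}
```

To jump to a page, the reader would look up its mark, seek to
mark.offset in the file, and prime the parser's tag stack with
mark.tags before resuming. A mark that also had to restore namespace
declarations or attributes would need more fields than this.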

> and here is how it could be done in XmlPull (for details see: 
> http://www.extreme.indiana.edu/~aslom/xmlpull/patterns.html#ANY_ORDER)
> 			 wrapper.skipSubTree();

I think the advantage of having the nesting level explicit in the
parsing is that the parser is in a position to deal reasonably
robustly with malformed XML, without aborting.

I started off aborting with an error on any mismatched tag, but I
found that in practice the files I was finding on the net had a
plethora of minor errors. Fixing them is much easier if the parser
gives warnings for many errors in the same document (sometimes there
are hundreds of errors) rather than aborting at the first one.

Of course, skipSubTree could do something like that, but it has not
got the option of ascending further up in the tree than the level at
which it was called, which is sometimes the best thing to do
(depending on your recovery heuristics, obviously).
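
To make the recovery idea concrete, here is a minimal Java sketch of
one possible heuristic. It is an assumption for illustration, not the
code from my parser and not part of XmlPull: the class LenientNesting
and its methods are made-up names. On a mismatched end tag it looks up
the stack of open tags. If a match exists further up, it closes the
levels in between and records a warning for each. If no match exists,
it ignores the stray end tag with a warning. Either way parsing
carries on.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical nesting tracker that warns on mismatched tags instead of aborting. */
final class LenientNesting {
    private final Deque<String> open = new ArrayDeque<>(); // innermost tag on top
    private final List<String> warnings = new ArrayList<>();

    void startTag(String name) {
        open.push(name);
    }

    /** Handles an end tag; may close several levels if the match is further up. */
    void endTag(String name) {
        if (!open.contains(name)) {
            // No open element has this name: treat it as a stray end tag.
            warnings.add("stray </" + name + "> ignored");
            return;
        }
        // Ascend the tree: implicitly close every unclosed element above the match.
        while (!open.peek().equals(name)) {
            warnings.add("<" + open.pop() + "> closed implicitly by </" + name + ">");
        }
        open.pop(); // the matching element itself
    }

    int depth() {
        return open.size();
    }

    List<String> warnings() {
        return warnings;
    }
}
```

Because the tracker owns the whole stack, it can unwind past the
element where the trouble was first noticed, which a helper confined
to one subtree cannot do. Other heuristics (for example, limiting how
many levels may be closed implicitly) would fit the same structure.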


